This function retrieves toplevel sequences. These sequences correspond to genomic regions in the genome assembly that are not a component of another sequence region. Thus, toplevel sequences will be chromosomes and any unlocalised or unplaced scaffolds.
Usage
get_toplevel_sequences(
species_name = "homo_sapiens",
verbose = FALSE,
warnings = TRUE,
progress_bar = TRUE
)
Arguments
- species_name
The species name, i.e., the scientific name, all letters lowercase and spaces replaced by underscores. Examples: 'homo_sapiens' (human), 'ovis_aries' (domestic sheep) or 'capra_hircus' (goat).
- verbose
Whether to be chatty.
- warnings
Whether to print warnings.
- progress_bar
Whether to show a progress bar.
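The naming convention expected by species_name can be sketched in a few lines of R. The helper to_ensembl_species_name() below is hypothetical and used only for illustration; it is not claimed to be part of the package, and it only reformats a name without checking that Ensembl actually knows the species.

```r
# Hypothetical helper (not part of the package): convert a scientific name
# such as "Homo sapiens" to the lowercase, underscore-separated form.
to_ensembl_species_name <- function(scientific_name) {
  # Trim surrounding whitespace, lowercase, then collapse each run of
  # whitespace into a single underscore.
  gsub("[[:space:]]+", "_", tolower(trimws(scientific_name)))
}

to_ensembl_species_name("Homo sapiens")
to_ensembl_species_name("Ovis aries")
```

The result can then be passed as the species_name argument.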
Value
A tibble, each row being a toplevel sequence, of 4 variables:
species_name
Ensembl species name: this is the name used internally by Ensembl to uniquely identify a species. It is the scientific name, written in lowercase with spaces replaced by underscores, e.g., 'homo_sapiens'.
coord_system
Coordinate system type.
toplevel_sequence
Name of the toplevel sequence.
length
Genomic length of the toplevel sequence, in base pairs.
Examples
# Get toplevel sequences for the human genome (default)
get_toplevel_sequences()
#> # A tibble: 194 × 4
#>    species_name coord_system toplevel_sequence length
#>    <chr>        <chr>        <chr>              <int>
#>  1 homo_sapiens scaffold     KI270757.1         71251
#>  2 homo_sapiens scaffold     KI270741.1        157432
#>  3 homo_sapiens scaffold     KI270756.1         79590
#>  4 homo_sapiens scaffold     KI270730.1        112551
#>  5 homo_sapiens scaffold     KI270739.1         73985
#>  6 homo_sapiens scaffold     KI270738.1         99375
#>  7 homo_sapiens scaffold     KI270737.1        103838
#>  8 homo_sapiens scaffold     KI270312.1           998
#>  9 homo_sapiens scaffold     KI270591.1          5796
#> 10 homo_sapiens scaffold     KI270371.1          2805
#> # ℹ 184 more rows
# Get toplevel sequences for Caenorhabditis elegans
get_toplevel_sequences('caenorhabditis_elegans')
#> # A tibble: 7 × 4
#>   species_name           coord_system toplevel_sequence   length
#>   <chr>                  <chr>        <chr>                <int>
#> 1 caenorhabditis_elegans chromosome   I                 15072434
#> 2 caenorhabditis_elegans chromosome   II                15279421
#> 3 caenorhabditis_elegans chromosome   III               13783801
#> 4 caenorhabditis_elegans chromosome   IV                17493829
#> 5 caenorhabditis_elegans chromosome   V                 20924180
#> 6 caenorhabditis_elegans chromosome   X                 17718942
#> 7 caenorhabditis_elegans chromosome   MtDNA                13794
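
Because the return value is an ordinary table with one row per toplevel sequence, it can be summarised with standard R tools. The sketch below is illustrative only: to stay runnable offline it uses a base R data frame as a stand-in for the tibble, with rows copied from the Caenorhabditis elegans output above, and the variable names (toplevel, total_length, longest) are assumptions made for this example.

```r
# Stand-in for get_toplevel_sequences('caenorhabditis_elegans');
# values copied from the example output above.
toplevel <- data.frame(
  species_name = "caenorhabditis_elegans",
  coord_system = "chromosome",
  toplevel_sequence = c("I", "II", "III", "IV", "V", "X", "MtDNA"),
  length = c(15072434L, 15279421L, 13783801L, 17493829L,
             20924180L, 17718942L, 13794L),
  stringsAsFactors = FALSE
)

# Total length of all toplevel sequences, in base pairs.
total_length <- sum(toplevel$length)

# Name of the longest toplevel sequence.
longest <- toplevel$toplevel_sequence[which.max(toplevel$length)]

# Toplevel sequences ordered from longest to shortest.
toplevel[order(toplevel$length, decreasing = TRUE), ]
```

For assemblies much larger than this one, converting length with as.numeric() before summing may be advisable, since integer sums in R overflow to NA above roughly 2.1 billion.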