This function retrieves details about the genome assembly of one or more queried species.

Usage

get_assemblies(
  species_name = "homo_sapiens",
  verbose = FALSE,
  warnings = TRUE,
  progress_bar = TRUE
)

Arguments

species_name

The species name, i.e., the scientific name in all lowercase letters with spaces replaced by underscores. Examples: 'homo_sapiens' (human), 'ovis_aries' (domestic sheep) or 'capra_hircus' (goat). Several species can be queried at once by passing a character vector.

verbose

Whether to be chatty.

warnings

Whether to print warnings.

progress_bar

Whether to show a progress bar.
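The formatting rule described for species_name (lowercase, spaces replaced by underscores) can be sketched in base R as below. The helper to_ensembl_species_name() is hypothetical and not part of this package; it only covers the simple case of a plain binomial name, and some Ensembl species names (strains, for instance) carry extra components that it does not attempt to produce.

```r
# Hypothetical helper (not part of the package): convert a scientific name
# such as "Homo sapiens" into the lowercase, underscore-separated form
# described for the species_name argument.
to_ensembl_species_name <- function(scientific_name) {
  # Drop leading and trailing whitespace before converting
  trimmed <- trimws(scientific_name)
  # Lowercase everything and replace each run of whitespace by one underscore
  gsub("\\s+", "_", tolower(trimmed))
}

species <- to_ensembl_species_name(c("Homo sapiens", "Ovis aries", " Capra hircus "))
print(species)
```

The resulting character vector could then be passed as the species_name argument of get_assemblies().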

Value

A tibble of 10 variables, each row being the assembly of one queried species:

species_name

Ensembl species name: this is the name used internally by Ensembl to uniquely identify a species. It is the scientific name written in lowercase with spaces replaced by underscores, e.g., 'homo_sapiens'.

assembly_name

Assembly name.

assembly_date

Assembly date.

genebuild_method

Annotation method.

golden_path_length

Golden path length.

genebuild_initial_release_date

Genebuild release date.

default_coord_system_version

Default coordinate system version.

assembly_accession

Assembly accession.

genebuild_start_date

Genebuild start date.

genebuild_last_geneset_update

Genebuild last geneset update.

Examples

# Get details about the human assembly
get_assemblies()
#> # A tibble: 1 × 10
#>   species_name assembly_name assembly_date genebuild_method golden_path_length
#>   <chr>        <chr>         <chr>         <chr>                         <dbl>
#> 1 homo_sapiens GRCh38.p14    2013-12       full_genebuild           3099750718
#> # ℹ 5 more variables: genebuild_initial_release_date <chr>,
#> #   default_coord_system_version <chr>, assembly_accession <chr>,
#> #   genebuild_start_date <chr>, genebuild_last_geneset_update <chr>

# Get details about the Mouse and the Fruit Fly genomes
get_assemblies(c('mus_musculus', 'drosophila_melanogaster'))
#> # A tibble: 2 × 10
#>   species_name   assembly_name assembly_date genebuild_method golden_path_length
#>   <chr>          <chr>         <chr>         <chr>                         <dbl>
#> 1 mus_musculus   GRCm39        2020-06       full_genebuild           2728222451
#> 2 drosophila_me… BDGP6.46      NA            import                    143726002
#> # ℹ 5 more variables: default_coord_system_version <chr>,
#> #   assembly_accession <chr>, genebuild_start_date <chr>,
#> #   genebuild_last_geneset_update <chr>, genebuild_initial_release_date <chr>
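
In the output above, assembly_date is a character column in year-month form and may be NA, and golden_path_length is a plain number of bases. A minimal base R sketch of post-processing these two columns follows. It runs on a small data frame rebuilt by hand from the values printed above rather than on a live query, and the column names assembly_month and golden_path_gb are illustrative assumptions, not part of the returned tibble.

```r
# Stand-in for the result of get_assemblies(), rebuilt from the values
# printed in the examples above (only three of the ten columns are kept).
assemblies <- data.frame(
  species_name = c("homo_sapiens", "mus_musculus", "drosophila_melanogaster"),
  assembly_date = c("2013-12", "2020-06", NA),
  golden_path_length = c(3099750718, 2728222451, 143726002),
  stringsAsFactors = FALSE
)

# Parse "YYYY-MM" as the first day of that month. Giving an explicit format
# means a missing assembly_date becomes NA instead of raising an error.
assemblies$assembly_month <- as.Date(
  paste0(assemblies$assembly_date, "-01"),
  format = "%Y-%m-%d"
)

# Express the golden path length in gigabases (illustrative column name).
assemblies$golden_path_gb <- assemblies$golden_path_length / 1e9

print(assemblies)
```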