This function retrieves cross-references to external databases by Ensembl
identifier. The data is returned as a tibble where each row is a
cross-reference related to the provided Ensembl identifier. See the Value
section below for details about each column.
Usage
get_xrefs_by_ensembl_id(
species_name,
ensembl_id,
all_levels = FALSE,
ensembl_db = "core",
external_db = "",
feature = "",
verbose = FALSE,
warnings = TRUE,
progress_bar = TRUE
)
Arguments
- species_name
The species name, i.e., the scientific name in all lowercase letters with spaces replaced by underscores. Examples: 'homo_sapiens' (human), 'ovis_aries' (domestic sheep) or 'capra_hircus' (goat).
- ensembl_id
An Ensembl stable identifier, e.g. "ENSG00000248378".
- all_levels
A logical vector. Set to TRUE to find all genetic features linked to the stable ID, and fetch all external references for them. Specifying this on a gene will also return values from its transcripts and translations.
- ensembl_db
Restrict the search to an Ensembl database: typically one of 'core', 'rnaseq', 'cdna', 'funcgen' and 'otherfeatures'.
- external_db
External database to be filtered by. By default no filtering is applied.
- feature
Restrict search to a feature type: gene ('gene'), exon ('exon'), transcript ('transcript'), and protein ('translation').
- verbose
Whether to be verbose about the HTTP requests and the status of their responses.
- warnings
Whether to show warnings.
- progress_bar
Whether to show a progress bar.
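The species_name convention described above (scientific name, lowercase, spaces replaced by underscores) can be sketched with a small base R helper. Note that to_ensembl_species_name() is a hypothetical name used only for illustration; it is not part of the package.

```r
# Hypothetical helper (not part of the package): convert a scientific name
# such as "Homo sapiens" into the lowercase, underscore-separated form.
to_ensembl_species_name <- function(scientific_name) {
  # Trim surrounding whitespace, then lowercase everything.
  name <- tolower(trimws(scientific_name))
  # Collapse each run of whitespace into a single underscore.
  gsub("[[:space:]]+", "_", name)
}

to_ensembl_species_name(c("Homo sapiens", "Ovis aries", "Capra hircus"))
```

The helper is vectorised, so a character vector of scientific names can be converted in one call.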
Value
A tibble of 12 variables:
species_name
Ensembl species name: this is the name used internally by Ensembl to uniquely identify a species by name. It is the scientific name in lowercase with spaces replaced by underscores, e.g., 'homo_sapiens'.
ensembl_id
An Ensembl stable identifier, e.g. "ENSG00000248378".
ensembl_db
Ensembl database.
primary_id
Primary identification in external database.
display_id
Display identification in external database.
external_db_name
External database name.
external_db_display_name
External database display name.
version
TODO
info_type
There are two types of external cross-references (XRef): direct ('DIRECT') or dependent ('DEPENDENT'). A direct cross-reference is one that can be directly linked to a gene, transcript or translation object in Ensembl Genomes by synonymy or sequence similarity. A dependent cross-reference is one that is transitively linked to the object via the direct cross-reference. The value can also be 'UNMAPPED' for unmapped cross-references, or 'PROJECTION' for TODO.
info_text
TODO
synonyms
Other names or acronyms used to refer to the external database entry. Note that this column is of the list type.
description
Brief description of the external database entry.
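Because synonyms is a list column, each row holds a character vector rather than a single string. The base R sketch below shows one way to count and flatten those vectors. The xrefs data frame is a toy stand-in built by hand with placeholder values; it is not real output from get_xrefs_by_ensembl_id().

```r
# Toy stand-in for the returned tibble; all values are placeholders,
# not real Ensembl output.
xrefs <- data.frame(
  primary_id = c("id_1", "id_2", "id_3"),
  stringsAsFactors = FALSE
)
# A list column: one character vector of synonyms per row.
xrefs$synonyms <- list(c("syn_a", "syn_b"), character(0), "syn_c")

# Number of synonyms per cross-reference.
n_synonyms <- lengths(xrefs$synonyms)

# Collapse each synonym vector into a single string for display.
synonyms_flat <- vapply(xrefs$synonyms, paste, character(1), collapse = "; ")
```

Rows without synonyms hold an empty character vector, which collapses to an empty string in this sketch.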
Ensembl REST API endpoints
get_xrefs_by_ensembl_id() makes GET requests to /xrefs/id/:id.
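As a rough illustration of such a request, the sketch below assembles a URL for the /xrefs/id/:id endpoint in base R without sending it. The base URL https://rest.ensembl.org and the query parameter names all_levels, db_type, external_db and object_type are taken from the public Ensembl REST API; how get_xrefs_by_ensembl_id() maps its own arguments onto them internally is an assumption here, and build_xrefs_url() is a hypothetical helper, not a package function.

```r
# Hypothetical helper (not part of the package): build the GET URL for
# /xrefs/id/:id. The argument-to-parameter mapping is an assumption.
build_xrefs_url <- function(ensembl_id,
                            all_levels = FALSE,
                            ensembl_db = "core",
                            external_db = "",
                            feature = "",
                            base_url = "https://rest.ensembl.org") {
  params <- c(
    all_levels = as.character(as.integer(all_levels)),
    db_type = ensembl_db,
    external_db = external_db,
    object_type = feature
  )
  # Drop parameters left at their empty default so no filter is sent.
  params <- params[nzchar(params)]
  # URL-encode each value and join the pairs as key=value&key=value.
  values <- vapply(params, utils::URLencode, character(1), reserved = TRUE)
  query <- paste0(names(params), "=", values, collapse = "&")
  paste0(base_url, "/xrefs/id/", ensembl_id, "?", query)
}

build_xrefs_url("ENSG00000248378", all_levels = TRUE)
```

Empty-string defaults are dropped from the query, mirroring the documented behaviour that no filtering is applied by default.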
Examples
get_xrefs_by_ensembl_id('human', 'ENSG00000248378')
#> # A tibble: 1 × 12
#>   species_name ensembl_id      ensembl_db primary_id display_id external_db_name
#>   <chr>        <chr>           <chr>      <chr>      <chr>      <chr>
#> 1 human        ENSG00000248378 core       ENSG00000… ENSG00000… ArrayExpress
#> # ℹ 6 more variables: external_db_display_name <chr>, version <chr>,
#> # info_type <chr>, info_text <chr>, synonyms <list>, description <lgl>
get_xrefs_by_ensembl_id('human', 'ENSG00000248378', all_levels = TRUE)
#> # A tibble: 3 × 12
#>   species_name ensembl_id      ensembl_db primary_id display_id external_db_name
#>   <chr>        <chr>           <chr>      <chr>      <chr>      <chr>
#> 1 human        ENSG00000248378 core       ENSG00000… ENSG00000… ArrayExpress
#> 2 human        ENSG00000248378 core       uc063csj.1 uc063csj.1 UCSC
#> 3 human        ENSG00000248378 core       URS000007… URS000007… RNAcentral
#> # ℹ 6 more variables: external_db_display_name <chr>, version <chr>,
#> # info_type <chr>, info_text <chr>, synonyms <list>, description <lgl>