
This function retrieves cross references by the symbol or display name of a gene. The data is returned as a tibble in which each row is one cross reference related to the provided gene. See the Value section below for details about each column.

Usage

get_xrefs_by_gene(
  species_name,
  gene,
  ensembl_db = "core",
  external_db = "",
  verbose = FALSE,
  warnings = TRUE,
  progress_bar = TRUE
)

Arguments

species_name

The species name, i.e., the scientific name in all lowercase letters with spaces replaced by underscores. Examples: 'homo_sapiens' (human), 'ovis_aries' (domestic sheep) or 'capra_hircus' (goat).

gene

Symbol or display name of a gene, e.g., 'ACTB' or 'BRCA2'.

ensembl_db

Restrict the search to a database other than the default. Ensembl's default database is 'core'.

external_db

Filter by external database, e.g. 'HGNC'. An empty string indicates no filtering.

verbose

Whether to be verbose about the HTTP requests and the status of their responses.

warnings

Whether to show warnings.

progress_bar

Whether to show a progress bar.

Value

A tibble of 12 variables:

species_name

Ensembl species name: this is the name used internally by Ensembl to uniquely identify a species. It is the scientific name written without capitalisation and with spaces replaced by underscores, e.g., 'homo_sapiens'.

gene

Gene symbol.

ensembl_db

Ensembl database.

primary_id

Primary identifier in the external database.

display_id

Display identifier in the external database.

external_db_name

External database name.

external_db_display_name

External database display name.

version

TODO

info_type

There are two types of external cross references (XRef): direct ('DIRECT') or dependent ('DEPENDENT'). A direct cross reference is one that can be directly linked to a gene, transcript or translation object in Ensembl Genomes by synonymy or sequence similarity. A dependent cross reference is one that is transitively linked to the object via the direct cross reference. The value can also be 'UNMAPPED' for unmapped cross references, or 'PROJECTION' for TODO.

info_text

TODO

synonyms

Other names or acronyms used to refer to the gene. Note that this column is of the list type.
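Because synonyms is a list column, each row holds a character vector rather than a single string. The sketch below shows one way to flatten and count those vectors in base R. The data frame is made up for illustration and only mimics the shape of the returned tibble; the identifiers, database names, and aliases in it are placeholders, not real Ensembl data.

```r
# Made-up rows that only mimic the shape of the returned tibble.
xrefs <- data.frame(
  primary_id = c("ID_1", "ID_2"),
  external_db_name = c("DB_A", "DB_B"),
  stringsAsFactors = FALSE
)

# A list column: one character vector of synonyms per row (possibly empty).
xrefs$synonyms <- list(c("ALIAS1", "ALIAS2"), character(0))

# Collapse each row's synonyms into a single comma-separated string.
xrefs$synonyms_flat <- vapply(xrefs$synonyms, paste, character(1), collapse = ", ")

# Count how many synonyms each cross reference carries.
xrefs$n_synonyms <- lengths(xrefs$synonyms)

xrefs[, c("primary_id", "synonyms_flat", "n_synonyms")]
```

Rows without synonyms yield an empty string and a count of zero, so they can be filtered out with `xrefs[xrefs$n_synonyms > 0, ]` if needed.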

description

Brief description of the external database entry.

Ensembl REST API endpoints

get_xrefs_by_gene() makes GET requests to /xrefs/name/:species/:name.
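To make the endpoint template concrete, the following is a minimal sketch of how a request URL for it might be assembled. `build_xrefs_url()` is a hypothetical helper written for this illustration, not a function of the package. It assumes the public server `https://rest.ensembl.org`, and that the `ensembl_db` and `external_db` arguments map to query parameters named `db_type` and `external_db`. It performs no request.

```r
# Hypothetical helper (not part of the package): builds the URL only.
build_xrefs_url <- function(species_name, gene,
                            ensembl_db = "core", external_db = "",
                            server = "https://rest.ensembl.org") {
  # Percent-encode the path segments so unusual gene names stay valid.
  path <- sprintf(
    "/xrefs/name/%s/%s",
    utils::URLencode(species_name, reserved = TRUE),
    utils::URLencode(gene, reserved = TRUE)
  )

  # Assumed query parameter names; an empty external_db means no filtering.
  query <- c(db_type = ensembl_db)
  if (nzchar(external_db)) query <- c(query, external_db = external_db)

  query_string <- paste0(
    names(query), "=",
    vapply(query, utils::URLencode, character(1), reserved = TRUE),
    collapse = "&"
  )

  paste0(server, path, "?", query_string)
}

build_xrefs_url("homo_sapiens", "BRCA2", external_db = "HGNC")
```

Keeping URL construction separate from the HTTP call makes it easy to inspect exactly what would be requested before any network traffic happens.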

Examples

# Get cross references that relate to gene BRCA2
# (here the species is given as 'human' rather than 'homo_sapiens')
get_xrefs_by_gene(species_name = 'human', gene = 'BRCA2')
#> # A tibble: 9 × 12
#>   species_name gene  ensembl_db primary_id display_id external_db_name
#>   <chr>        <chr> <chr>      <chr>      <chr>      <chr>           
#> 1 human        BRCA2 core       675        BRCA2      EntrezGene      
#> 2 human        BRCA2 core       1101       BRCA2      GeneCards       
#> 3 human        BRCA2 core       HGNC:1101  BRCA2      HGNC            
#> 4 human        BRCA2 core       A0A7P0TAP7 BRCA2      Uniprot_gn      
#> 5 human        BRCA2 core       A0A8V8TQQ4 BRCA2      Uniprot_gn      
#> 6 human        BRCA2 core       P51587     BRCA2      Uniprot_gn      
#> 7 human        BRCA2 core       A0A7P0T9D7 BRCA2      Uniprot_gn      
#> 8 human        BRCA2 core       675        BRCA2      WikiGene        
#> 9 human        BRCA2 core       IPR015525  BRCA2      Interpro        
#> # ℹ 6 more variables: external_db_display_name <chr>, version <chr>,
#> #   info_type <chr>, info_text <chr>, synonyms <list>, description <chr>