Get cytogenetic bands by species — get_cytogenetic_bands

This function retrieves cytogenetic bands. If no cytogenetic information is available for the queried species, then it is omitted from the returned value.

Usage

get_cytogenetic_bands(
  species_name = "homo_sapiens",
  verbose = FALSE,
  warnings = TRUE,
  progress_bar = TRUE
)

Arguments

species_name: The species name, i.e., the scientific name in lowercase with spaces replaced by underscores. Examples: 'homo_sapiens' (human), 'ovis_aries' (domestic sheep) or 'capra_hircus' (goat).
verbose: Whether to be chatty.
warnings: Whether to print warnings.
progress_bar: Whether to show a progress bar.
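
The formatting rule for species_name can be sketched in a few lines of base R. The helper below, to_species_name(), is hypothetical and not part of the package; it only illustrates the lowercase-and-underscore convention described above, assuming the input is an ordinary scientific name such as "Homo sapiens".

```r
# Hypothetical helper (not provided by the package): format a scientific
# name the way the `species_name` argument expects it.
to_species_name <- function(scientific_name) {
  # Trim surrounding whitespace, convert to lowercase, then replace each
  # run of whitespace with a single underscore.
  gsub("[[:space:]]+", "_", tolower(trimws(scientific_name)))
}

to_species_name(c("Homo sapiens", "Ovis aries", "Capra hircus"))
```

The result could then be passed as species_name, e.g. get_cytogenetic_bands(to_species_name("Mus musculus")).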

Value

A tibble, each row being a cytogenetic band, of 8 variables:

species_name: Ensembl species name: this is the name used internally by Ensembl to uniquely identify a species. It is the scientific name in lowercase, with spaces replaced by underscores, e.g., 'homo_sapiens'.
assembly_name: Assembly name.
cytogenetic_band: Name of the cytogenetic band.
chromosome: Chromosome name.
start: Genomic start position of the cytogenetic band. Starts at 1.
end: Genomic end position of the cytogenetic band. End position is included in the band interval.
stain: Giemsa stain results: Giemsa negative, 'gneg'; Giemsa positive, of increasing intensities, 'gpos25', 'gpos50', 'gpos75', and 'gpos100'; centromeric region, 'acen'; heterochromatin, either pericentric or telomeric, 'gvar'; and short arm of acrocentric chromosomes, 'stalk'.
strand: Strand.
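
Because start is 1-based and end is included in the interval, a band spans end - start + 1 bases, and a position equal to end still belongs to that band. The sketch below shows this arithmetic on a toy data frame. The band names and coordinates are made up for illustration, and band_width() and band_at() are hypothetical helpers, not package functions.

```r
# Toy bands with made-up names and coordinates, laid out like the
# `cytogenetic_band`, `start` and `end` columns of the returned tibble.
bands <- data.frame(
  cytogenetic_band = c("bandA", "bandB"),
  start = c(1L, 300001L),
  end = c(300000L, 600000L),
  stringsAsFactors = FALSE
)

# Closed, 1-based interval: both endpoints count, hence the + 1.
band_width <- function(start, end) end - start + 1L

# Name(s) of the band(s) whose closed interval contains `position`.
band_at <- function(bands, position) {
  bands$cytogenetic_band[bands$start <= position & position <= bands$end]
}

band_width(bands$start, bands$end)
band_at(bands, 300000L) # last base of the first toy band
band_at(bands, 300001L) # first base of the second toy band
```

The same helpers should work unchanged on the tibble returned by get_cytogenetic_bands(), once it is restricted to a single chromosome.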

Examples

# Get cytogenetic bands for the human genome (default)
get_cytogenetic_bands()
#> # A tibble: 862 × 8
#>    species_name assembly_name cytogenetic_band chromosome    start     end stain
#>    <chr>        <chr>         <chr>            <chr>         <int>   <int> <chr>
#>  1 homo_sapiens GRCh38        p11.1            Y          10300001  1.04e7 acen 
#>  2 homo_sapiens GRCh38        p11.2            Y            600001  1.03e7 gneg 
#>  3 homo_sapiens GRCh38        p11.31           Y            300001  6   e5 gpos…
#>  4 homo_sapiens GRCh38        p11.32           Y                 1  3   e5 gneg 
#>  5 homo_sapiens GRCh38        q11.1            Y          10400001  1.06e7 acen 
#>  6 homo_sapiens GRCh38        q11.21           Y          10600001  1.24e7 gneg 
#>  7 homo_sapiens GRCh38        q11.221          Y          12400001  1.71e7 gpos…
#>  8 homo_sapiens GRCh38        q11.222          Y          17100001  1.96e7 gneg 
#>  9 homo_sapiens GRCh38        q11.223          Y          19600001  2.38e7 gpos…
#> 10 homo_sapiens GRCh38        q11.23           Y          23800001  2.66e7 gneg 
#> # ℹ 852 more rows
#> # ℹ 1 more variable: strand <int>

# Get cytogenetic bands for Mus musculus
get_cytogenetic_bands('mus_musculus')
#> # A tibble: 82 × 8
#>    species_name assembly_name cytogenetic_band chromosome   start      end stain
#>    <chr>        <chr>         <chr>            <chr>        <int>    <int> <chr>
#>  1 mus_musculus GRCm39        ""               MT               1   1.63e4 gneg 
#>  2 mus_musculus GRCm39        ""               Y                1   9.15e7 gneg 
#>  3 mus_musculus GRCm39        "cen"            2           110001   1.56e6 acen 
#>  4 mus_musculus GRCm39        "cen"            2          1555001   3   e6 acen 
#>  5 mus_musculus GRCm39        "p"              2                1   1.10e5 gneg 
#>  6 mus_musculus GRCm39        "q"              2          3000001   1.82e8 gneg 
#>  7 mus_musculus GRCm39        "cen"            4           110001   1.56e6 acen 
#>  8 mus_musculus GRCm39        "cen"            4          1555001   3   e6 acen 
#>  9 mus_musculus GRCm39        "p"              4                1   1.10e5 gneg 
#> 10 mus_musculus GRCm39        "q"              4          3000001   1.57e8 gneg 
#> # ℹ 72 more rows
#> # ℹ 1 more variable: strand <int>