This function retrieves cytogenetic bands. If no cytogenetic information is available for the queried species, it is omitted from the returned value.

Usage

get_cytogenetic_bands(
  species_name = "homo_sapiens",
  verbose = FALSE,
  warnings = TRUE,
  progress_bar = TRUE
)

Arguments

species_name

The species name, i.e., the scientific name with all letters in lowercase and spaces replaced by underscores. Examples: 'homo_sapiens' (human), 'ovis_aries' (domestic sheep) or 'capra_hircus' (goat).

verbose

Whether to print informative messages while running.

warnings

Whether to print warnings.

progress_bar

Whether to show a progress bar.
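
The format expected by species_name can be derived from a scientific name with base R. The following is a minimal sketch; to_species_name() is a hypothetical helper introduced here for illustration and is not part of the package.

```r
# Hypothetical helper (not part of the package): convert a scientific name
# to the lowercase, underscore-separated form expected by `species_name`.
to_species_name <- function(scientific_name) {
  # Trim surrounding whitespace, lowercase, then replace each run of
  # internal whitespace with a single underscore.
  gsub("\\s+", "_", tolower(trimws(scientific_name)))
}

to_species_name("Homo sapiens")
#> [1] "homo_sapiens"
to_species_name("Ovis aries")
#> [1] "ovis_aries"
```

The result can then be passed as the species_name argument.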

Value

A tibble, each row being a cytogenetic band, of 8 variables:

species_name

Ensembl species name: this is the name used internally by Ensembl to uniquely identify a species. It is the scientific name in lowercase with spaces replaced by underscores, e.g., 'homo_sapiens'.

assembly_name

Assembly name.

cytogenetic_band

Name of the cytogenetic band.

chromosome

Chromosome name.

start

Genomic start position of the cytogenetic band. Coordinates are 1-based.

end

Genomic end position of the cytogenetic band. End position is included in the band interval.

stain

Giemsa stain results: Giemsa negative, 'gneg'; Giemsa positive, of increasing intensities, 'gpos25', 'gpos50', 'gpos75', and 'gpos100'; centromeric region, 'acen'; heterochromatin, either pericentric or telomeric, 'gvar'; and short arm of acrocentric chromosomes, 'stalk'.

strand

Strand.
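
Because start is 1-based and end is included in the band interval, the length of a band is end - start + 1. The sketch below illustrates this on a hand-built data frame rather than on a live query. Its two rows are modelled on the human chromosome Y bands shown in the examples, and the end values are an assumption inferred from the start of the adjacent band, since the printed output abbreviates them.

```r
# Hand-built stand-in for two rows of the returned tibble.
# `end` values are assumed (inferred from the next band's `start`).
bands <- data.frame(
  cytogenetic_band = c("p11.32", "p11.1"),
  chromosome = c("Y", "Y"),
  start = c(1L, 10300001L),
  end = c(300000L, 10400000L),
  stringsAsFactors = FALSE
)

# Inclusive, 1-based coordinates: length in base pairs is end - start + 1.
bands$width <- bands$end - bands$start + 1L

bands$width
#> [1] 300000 100000
```

The same expression should apply unchanged to the tibble returned by get_cytogenetic_bands().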

Examples

# Get cytogenetic bands for the human genome (default)
get_cytogenetic_bands()
#> # A tibble: 862 × 8
#>    species_name assembly_name cytogenetic_band chromosome    start     end stain
#>    <chr>        <chr>         <chr>            <chr>         <int>   <int> <chr>
#>  1 homo_sapiens GRCh38        p11.1            Y          10300001  1.04e7 acen 
#>  2 homo_sapiens GRCh38        p11.2            Y            600001  1.03e7 gneg 
#>  3 homo_sapiens GRCh38        p11.31           Y            300001  6   e5 gpos…
#>  4 homo_sapiens GRCh38        p11.32           Y                 1  3   e5 gneg 
#>  5 homo_sapiens GRCh38        q11.1            Y          10400001  1.06e7 acen 
#>  6 homo_sapiens GRCh38        q11.21           Y          10600001  1.24e7 gneg 
#>  7 homo_sapiens GRCh38        q11.221          Y          12400001  1.71e7 gpos…
#>  8 homo_sapiens GRCh38        q11.222          Y          17100001  1.96e7 gneg 
#>  9 homo_sapiens GRCh38        q11.223          Y          19600001  2.38e7 gpos…
#> 10 homo_sapiens GRCh38        q11.23           Y          23800001  2.66e7 gneg 
#> # ℹ 852 more rows
#> # ℹ 1 more variable: strand <int>

# Get cytogenetic bands for Mus musculus
get_cytogenetic_bands('mus_musculus')
#> # A tibble: 82 × 8
#>    species_name assembly_name cytogenetic_band chromosome   start      end stain
#>    <chr>        <chr>         <chr>            <chr>        <int>    <int> <chr>
#>  1 mus_musculus GRCm39        ""               MT               1   1.63e4 gneg 
#>  2 mus_musculus GRCm39        ""               Y                1   9.15e7 gneg 
#>  3 mus_musculus GRCm39        "cen"            2           110001   1.56e6 acen 
#>  4 mus_musculus GRCm39        "cen"            2          1555001   3   e6 acen 
#>  5 mus_musculus GRCm39        "p"              2                1   1.10e5 gneg 
#>  6 mus_musculus GRCm39        "q"              2          3000001   1.82e8 gneg 
#>  7 mus_musculus GRCm39        "cen"            4           110001   1.56e6 acen 
#>  8 mus_musculus GRCm39        "cen"            4          1555001   3   e6 acen 
#>  9 mus_musculus GRCm39        "p"              4                1   1.10e5 gneg 
#> 10 mus_musculus GRCm39        "q"              4          3000001   1.57e8 gneg 
#> # ℹ 72 more rows
#> # ℹ 1 more variable: strand <int>