The MGI report MGI_MRK_Coord.rpt lists mouse genetic markers along with their genomic annotations, such as chromosome position, genome assembly version, and the source of the annotation.

To read this report using the key "marker_coordinates", use the following code:

# To read all records (more than 600,000), use `read_report("marker_coordinates")`.
(marker_coord <- read_report(report_key = "marker_coordinates", n_max = 300L))
## # A tibble: 300 × 12
##    marker_type       marker_id marker_symbol marker_name feature_type chromosome
##    <fct>             <chr>     <chr>         <chr>       <fct>        <fct>     
##  1 Other Genome Fea… MGI:7712… Rr517         regulatory… enhancer     17        
##  2 Gene              MGI:87854 Pzp           PZP, alpha… protein cod… 6         
##  3 Gene              MGI:87859 Abl1          c-abl onco… protein cod… 2         
##  4 Gene              MGI:87862 Scgb1b27      secretoglo… protein cod… 7         
##  5 Gene              MGI:87863 Scgb2b27      secretoglo… protein cod… 7         
##  6 Gene              MGI:87864 Scgb2b26      secretoglo… protein cod… 7         
##  7 Gene              MGI:87866 Acadl         acyl-Coenz… protein cod… 1         
##  8 Gene              MGI:87867 Acadm         acyl-Coenz… protein cod… 3         
##  9 Gene              MGI:87868 Acads         acyl-Coenz… protein cod… 5         
## 10 Gene              MGI:87870 Acat1         acetyl-Coe… protein cod… 9         
## # ℹ 290 more rows
## # ℹ 6 more variables: start <int>, end <int>, strand <fct>, assembly <fct>,
## #   source <fct>, database <fct>

For the first 300 records, here is the count by marker type:

dplyr::count(marker_coord, marker_type)
## # A tibble: 4 × 2
##   marker_type              n
##   <fct>                <int>
## 1 Gene                    36
## 2 Pseudogene               1
## 3 DNA Segment            261
## 4 Other Genome Feature     2

By chromosome:

dplyr::count(marker_coord, chromosome)
## # A tibble: 16 × 2
##    chromosome     n
##    <fct>      <int>
##  1 1              4
##  2 2              4
##  3 3              2
##  4 4              1
##  5 5              5
##  6 6              1
##  7 7              4
##  8 8              2
##  9 9              8
## 10 10            70
## 11 11           191
## 12 12             1
## 13 13             2
## 14 14             1
## 15 15             1
## 16 17             3

By annotation source:

dplyr::count(marker_coord, source, database)
## # A tibble: 5 × 3
##   source  database               n
##   <fct>   <fct>              <int>
## 1 NCBI    NCBI Gene Model       19
## 2 Ensembl Ensembl Gene Model    18
## 3 MGI     MGI                  261
## 4 MGI     NA                     1
## 5 VISTA   VISTA Gene Model       1

Variables

marker_type

marker_type: genetic marker type, a factor with 10 levels: Gene, GeneModel, Pseudogene, DNA Segment, Transgene, QTL, Cytogenetic Marker, BAC/YAC end, Complex/Cluster/Region, Other Genome Feature. See ?marker_type_definitions for the meaning of each type.

marker_id

marker_id: MGI accession identifier. A unique alphanumeric character string that unambiguously identifies a particular record in the Mouse Genome Informatics database. The format is MGI:nnnnnn, where each n is a digit; the number of digits varies (e.g. MGI:87854).
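As a quick sanity check on identifiers, the format described above can be tested with a regular expression. The sketch below is a minimal illustration in base R; `is_mgi_id()` is a hypothetical helper written for this example, not a function provided by the package, and it assumes that any positive number of digits after the `MGI:` prefix is acceptable.

```r
# Hypothetical helper (not part of the package): TRUE when a string is
# "MGI:" followed by one or more digits, and nothing else.
is_mgi_id <- function(x) grepl("^MGI:[0-9]+$", x)

# Two well-formed identifiers, one without the prefix, one in lower case.
is_mgi_id(c("MGI:87854", "MGI:87859", "87854", "mgi:87854"))
## [1]  TRUE  TRUE FALSE FALSE
```

The match is case-sensitive, so a lower-case prefix is rejected.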

marker_symbol

marker_symbol: marker symbol is a unique abbreviation of the marker name.

marker_name

marker_name: marker name is a word or phrase that uniquely identifies the genetic marker, e.g. a gene or allele name.

feature_type

feature_type: an attribute of a portion of a genomic sequence. See the dataset ?feature_type_definitions for details.

chromosome

chromosome: mouse chromosome name. Possible values are the names of the autosomes, the sex chromosomes, or the mitochondrial chromosome.

start

start: genomic start position (one-offset).

end

end: genomic end position (one-offset).
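Because both positions are one-offset, the length of a feature is usually computed as `end - start + 1`. The sketch below illustrates this on a toy data frame; the coordinates are invented for the example and are not taken from the report, and the calculation assumes that `end` is inclusive, which the report description does not state explicitly.

```r
# Toy coordinates, invented for illustration only (not from the report).
toy <- data.frame(
  marker_symbol = c("toy1", "toy2"),
  start = c(101L, 5000L),
  end = c(200L, 5000L)
)

# Assumption: intervals are closed, i.e. both `start` and `end` belong to
# the feature, so a feature with start == end spans exactly one base.
toy$length <- toy$end - toy$start + 1L
toy
```

Under that assumption, the first toy feature spans 100 bases and the second spans a single base.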

strand

strand: DNA strand: ‘+’ for sense and ‘-’ for antisense.

assembly

assembly: mouse genome assembly version, a factor with two levels: 'GRCm38' and 'GRCm39'. Almost always 'GRCm39'.
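Since coordinates from different assemblies are not directly comparable, it can be useful to restrict an analysis to a single assembly before working with `start` and `end`. The following is a minimal base R sketch on a toy data frame; the rows are invented for illustration, and only the two factor levels come from the description above.

```r
# Toy table, invented for illustration only (not from the report).
toy <- data.frame(
  marker_symbol = c("toy1", "toy2", "toy3"),
  assembly = factor(
    c("GRCm39", "GRCm38", "GRCm39"),
    levels = c("GRCm38", "GRCm39")
  )
)

# `%in%` returns FALSE (not NA) for missing values, so rows with an
# unknown assembly are dropped rather than kept as all-NA rows.
grcm39_only <- toy[toy$assembly %in% "GRCm39", ]
nrow(grcm39_only)
## [1] 2
```

The same subset could be written with `dplyr::filter()`, which likewise drops rows where the condition evaluates to `NA`.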

source

source: provider of the genomic annotation.

database

database: database or catalogue within the source that provides the genomic annotation.