The MGI report MGI_MRK_Coord.rpt lists mouse genetic markers along with their genomic annotations, such as chromosome position, genome assembly version, and the source of the annotation.

To read this report using the key "marker_coordinates", use the following code:

# To read all records (more than 600,000), use `read_report("marker_coordinates")`.
(marker_coord <- read_report(report_key = "marker_coordinates", n_max = 300L))
## # A tibble: 300 × 12
##    marker_type marker_id marker_symbol marker_name       feature_type chromosome
##    <fct>       <chr>     <chr>         <chr>             <fct>        <fct>     
##  1 Gene        MGI:87881 Acp1          acid phosphatase… protein cod… 12        
##  2 Gene        MGI:87853 a             nonagouti         protein cod… 2         
##  3 Gene        MGI:87926 Adh7          alcohol dehydrog… protein cod… 3         
##  4 Gene        MGI:87882 Acp2          acid phosphatase… protein cod… 2         
##  5 Gene        MGI:87929 Adh5          alcohol dehydrog… protein cod… 3         
##  6 Gene        MGI:87930 Adk           adenosine kinase  protein cod… 14        
##  7 Gene        MGI:87854 Pzp           PZP, alpha-2-mac… protein cod… 6         
##  8 Gene        MGI:87859 Abl1          c-abl oncogene 1… protein cod… 2         
##  9 Gene        MGI:87883 Acp5          acid phosphatase… protein cod… 9         
## 10 Gene        MGI:87884 Acr           acrosin preprope… protein cod… 15        
## # ℹ 290 more rows
## # ℹ 6 more variables: start <int>, end <int>, strand <fct>, assembly <fct>,
## #   source <fct>, database <fct>

For the first 300 records, here is the count by marker type:

dplyr::count(marker_coord, marker_type)
## # A tibble: 3 × 2
##   marker_type     n
##   <fct>       <int>
## 1 Gene          286
## 2 Pseudogene      9
## 3 QTL             5

By chromosome:

dplyr::count(marker_coord, chromosome)
## # A tibble: 20 × 2
##    chromosome     n
##    <fct>      <int>
##  1 1             25
##  2 2             28
##  3 3             26
##  4 4             20
##  5 5             16
##  6 6             13
##  7 7             20
##  8 8             16
##  9 9             26
## 10 10            13
## 11 11            15
## 12 12            11
## 13 13             6
## 14 14            11
## 15 15             8
## 16 16             4
## 17 17            12
## 18 18             9
## 19 19             9
## 20 X             12

By annotation source:

dplyr::count(marker_coord, source, database)
## # A tibble: 3 × 3
##   source  database               n
##   <fct>   <fct>              <int>
## 1 NCBI    NCBI Gene Model      158
## 2 Ensembl Ensembl Gene Model   137
## 3 MGI     QTL                    5

Variables

marker_type

marker_type: genetic marker type, a factor with 10 levels: Gene, GeneModel, Pseudogene, DNA Segment, Transgene, QTL, Cytogenetic Marker, BAC/YAC end, Complex/Cluster/Region, Other Genome Feature. See ?marker_type_definitions for the meaning of each type.

marker_id

marker_id: MGI accession identifier. A unique alphanumeric character string that is used to unambiguously identify a particular record in the Mouse Genome Informatics database. The format is MGI:nnnnnn, where n is a digit.
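As a minimal sketch, a vector of identifiers can be checked against this format with base R. It assumes an identifier is the prefix MGI: followed by one or more digits; the records shown earlier have five-digit identifiers such as MGI:87881, so the number of digits is not treated as fixed. The helper name is_mgi_id is hypothetical and not part of the package.

```r
# Hypothetical helper: TRUE for strings made of "MGI:" followed by digits only.
# Assumption: the number of digits is variable (at least one).
is_mgi_id <- function(x) {
  grepl("^MGI:[0-9]+$", x)
}

ids <- c("MGI:87881", "MGI:87853", "mgi:87926", "MGI:", "87882")
is_mgi_id(ids)
## [1]  TRUE  TRUE FALSE FALSE FALSE
```

The match is case-sensitive, so a lower-case prefix is rejected.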

marker_symbol

marker_symbol: marker symbol is a unique abbreviation of the marker name.

marker_name

marker_name: marker name is a word or phrase that uniquely identifies the genetic marker, e.g. a gene or allele name.

feature_type

feature_type: an attribute of a portion of a genomic sequence. See the dataset ?feature_type_definitions for details.

chromosome

chromosome: mouse chromosome name. Possible values are the names of the autosomes, the sex chromosomes and the mitochondrial chromosome.

start

start: genomic start position (one-offset, i.e. the first base of the chromosome is position 1).

end

end: genomic end position (one-offset, i.e. the first base of the chromosome is position 1).
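The coordinate convention matters when deriving feature lengths from start and end. The sketch below assumes that the coordinates are one-based and that both start and end are inclusive; the second assumption is not stated in the report description. The data frame coords and its values are hypothetical, not taken from the report.

```r
# Hypothetical coordinates standing in for two records of the report.
coords <- data.frame(
  marker_symbol = c("geneA", "geneB"),
  start = c(1001L, 5000L),
  end = c(2000L, 5000L),
  stringsAsFactors = FALSE
)

# Assumption: one-based, inclusive coordinates, so both endpoints
# count towards the length and a single-base feature has length 1.
coords$length_bp <- coords$end - coords$start + 1L
coords$length_bp
## [1] 1000    1
```

Under a zero-based, half-open convention the length would instead be end - start, so the + 1L applies only under the assumption stated above.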

strand

strand: DNA strand, ‘+’ for the sense (forward) strand and ‘-’ for the antisense (reverse) strand.

assembly

assembly: mouse genome assembly version, a factor of two levels: 'GRCm38' and 'GRCm39'. Almost always 'GRCm39'.
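Because positions from different assemblies are not directly comparable, it can be useful to check which assemblies are present before working with coordinates. The following is a minimal sketch in base R; the data frame toy and its rows are hypothetical stand-ins for a few records of the report.

```r
# Hypothetical rows; the assembly column mimics the two-level factor.
toy <- data.frame(
  marker_symbol = c("geneA", "geneB", "geneC"),
  assembly = factor(
    c("GRCm39", "GRCm38", "GRCm39"),
    levels = c("GRCm38", "GRCm39")
  ),
  stringsAsFactors = FALSE
)

# Count records per assembly.
assembly_counts <- table(toy$assembly)

# Keep only GRCm39 records so that all coordinates refer to one assembly.
grcm39_only <- toy[toy$assembly == "GRCm39", , drop = FALSE]
grcm39_only$marker_symbol
## [1] "geneA" "geneC"
```

With the tibble read above, the same check could be written as dplyr::count(marker_coord, assembly), followed by dplyr::filter(marker_coord, assembly == "GRCm39").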

source

source: provider of the genomic annotation.

database

database: database or catalogue within the source that provides the genomic annotation.