Tutorial for building an orthg database

Jiang junyao

2024-08-27

Introduction

This tutorial is for users who want to build a customized orthg database.

In CACIMAR, we provide the ‘buildHomDatabase’ function to build an orthg database based on Mouse Genome Informatics (MGI) data.

To run this function, users need to supply the MGI table and the species names.

The MGI table can be downloaded from our Archive or from the official MGI website.
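Before passing the table to ‘buildHomDatabase’, it can help to confirm that it was parsed as expected. The following is a minimal sketch that uses a tiny inline stand-in (`mgi_text`, a hypothetical two-row excerpt) rather than the real report. It assumes the report is tab-delimited and that lines starting with `#` are comments. It uses base R's `read.delim`, which differs from `read.delim2` only in treating `.` rather than `,` as the decimal mark.

```r
# Hypothetical two-row excerpt of an MGI homology report (tab-delimited)
mgi_text <- paste(
  "# comment line, skipped by comment.char",
  "HomoloGene ID\tCommon Organism Name\tNCBI Taxon ID\tSymbol\tEntrezGene ID",
  "3\tmouse, laboratory\t10090\tAcadm\t11364",
  "3\thuman\t9606\tACADM\t34",
  sep = "\n"
)

# Parse the text; spaces in the header become dots (check.names = TRUE by default)
mgi_toy <- read.delim(text = mgi_text, comment.char = "#")

# Inspect the column names and the organisms present in the table
colnames(mgi_toy)
unique(mgi_toy$Common.Organism.Name)
```

With the real file, the same two inspection calls can be run on the loaded table to see which organisms it contains.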

For species names, this function currently only supports the following 10 species:

‘mm’ for mouse,

‘hs’ for human,

‘zf’ for zebrafish,

‘ch’ for chicken,

‘cf’ for dog,

‘pt’ for chimpanzee,

‘xt’ for frog,

‘rn’ for rat,

‘bt’ for cattle,

‘rh’ for macaque.
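The codes above can be kept in a small lookup so that a typo is caught before the database is built. This is only a sketch: `supported_species` and `check_species` are hypothetical helpers written for illustration and are not part of CACIMAR.

```r
# Species codes from the list above, mapped to common names
supported_species <- c(
  mm = "mouse", hs = "human", zf = "zebrafish", ch = "chicken", cf = "dog",
  pt = "chimpanzee", xt = "frog", rn = "rat", bt = "cattle", rh = "macaque"
)

# Hypothetical helper (not a CACIMAR function): stop early on an unsupported code
check_species <- function(code) {
  if (!code %in% names(supported_species)) {
    stop("Unsupported species code: ", code,
         ". Use one of: ", paste(names(supported_species), collapse = ", "))
  }
  # Return the common name invisibly so the call can be used as a plain check
  invisible(supported_species[[code]])
}

check_species("hs")
check_species("rh")
```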

Example

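# Load CACIMAR, which provides buildHomDatabase()
library(CACIMAR)
# Read the MGI homology report; replace the path with the location of your local copy
MGI_table = read.delim2("F:/platform/OrthG database/HOM_AllOrganism.rpt", comment.char="#")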
head(MGI_table)
##   HomoloGene.ID Common.Organism.Name NCBI.Taxon.ID Symbol EntrezGene.ID
## 1             3    mouse, laboratory         10090  Acadm         11364
## 2             3                human          9606  ACADM            34
## 3             3           chimpanzee          9598  ACADM        469356
## 4             3      macaque, rhesus          9544  ACADM        705168
## 5             3        dog, domestic          9615  ACADM        490207
## 6             3               cattle          9913  ACADM        505968
##   Mouse.MGI.ID HGNC.ID OMIM.Gene.ID Genetic.Location
## 1    MGI:87867                              Chr3  cM
## 2              HGNC:89  OMIM:607008       Chr1 p31.1
## 3                                               Chr1
## 4                                               Chr1
## 5                                               Chr6
## 6                                               Chr3
##   Genomic.Coordinates..mouse....human...
## 1                              Chr3:-(-)
## 2                              Chr1:-(+)
## 3                                       
## 4                                       
## 5                                       
## 6                                       
##                                          Name         Synonyms
## 1 acyl-Coenzyme A dehydrogenase, medium chain             MCAD
## 2         acyl-CoA dehydrogenase medium chain ACAD1|MCAD|MCADH
## 3         acyl-CoA dehydrogenase medium chain                 
## 4         acyl-CoA dehydrogenase medium chain                 
## 5         acyl-CoA dehydrogenase medium chain                 
## 6         acyl-CoA dehydrogenase medium chain
# Build the orthg database for human ('hs') and macaque ('rh')
orthg_db = buildHomDatabase(MGI_table,'hs','rh')
head(orthg_db)
##                             Type hs_ID hs_Symbol  rh_ID rh_Symbol
## c..ACADM....34....1..  hs_rh_1T1    34     ACADM 705168     ACADM
## c..ACAT1....38....1..  hs_rh_1T1    38     ACAT1 707653     ACAT1
## c..ACVR1....90....1..  hs_rh_1T1    90     ACVR1 697935     ACVR1
## c..SGCA....6442....1.. hs_rh_1T1  6442      SGCA 704493      SGCA
## c..AGA....175....1..   hs_rh_1T1   175       AGA 699740       AGA
## c..AGT....183....1..   hs_rh_1T1   183       AGT 714203       AGT
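
Once the database is built, it is an ordinary data frame and can be summarized with base R. The sketch below works on `orthg_toy`, a hypothetical stand-in that copies three rows from the output above, so it runs without CACIMAR or the MGI file. The `Type` value `hs_rh_1T1` appears to label one-to-one pairs, but that reading is an assumption here.

```r
# Toy stand-in with the same columns as the head(orthg_db) output above
orthg_toy <- data.frame(
  Type = rep("hs_rh_1T1", 3),
  hs_ID = c(34, 38, 90),
  hs_Symbol = c("ACADM", "ACAT1", "ACVR1"),
  rh_ID = c(705168, 707653, 697935),
  rh_Symbol = c("ACADM", "ACAT1", "ACVR1"),
  stringsAsFactors = FALSE
)

# Count gene pairs per relationship type
type_counts <- table(orthg_toy$Type)

# Look up the macaque symbol paired with a given human gene
rh_for_acat1 <- orthg_toy$rh_Symbol[match("ACAT1", orthg_toy$hs_Symbol)]
```

On the real result, replacing `orthg_toy` with `orthg_db` gives the number of pairs of each type and the same kind of symbol lookup.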