nature genetics
    A user's guide to the human genome

volume 32 supplement pp 70 - 73

Question 13
A user has identified an interesting phenotype in a mouse model and has been able to narrow down the critical region for the responsible gene to approximately 0.5 cM. How does one find the mouse genes in this region?

Ensembl provides a mouse genome browser, similar to the one available for humans. It is updated as new mouse genome sequence assemblies are released and, at the time of writing, displays the MGSC version 3 assembly of the mouse genome, with sequence data from February 2002. The sequence is estimated to cover 96% of mouse euchromatic DNA, and Ensembl predicts that it contains more than 22,000 genes.

Start at the Ensembl mouse home page. Choose Marker from the pull-down menu, type the marker name 'RH114718' in the adjacent box, and press Lookup. Click either of the resulting links to view more details about this radiation hybrid marker. RH114718 has been mapped to a single position on chromosome 19 and is also known as MGI:102447, MTH1904 and D19MIT109 (Fig. 13.1). Click on the chromosomal position to view the genomic context of the marker (Fig. 13.2).
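
Because the same marker appears under different names in different views (the radiation hybrid name in the search box, the MIT name in the browser display), it can help to normalize names before comparing results. The short Python sketch below does this using only the aliases listed above. The table layout and the function name `canonical_marker` are illustrative assumptions and are not part of Ensembl.

```python
from typing import Optional

# Aliases for the marker discussed in the text (mouse chromosome 19).
# The dictionary layout and function name are illustrative, not an Ensembl API.
MARKER_ALIASES = {
    "RH114718": {"MGI:102447", "MTH1904", "D19MIT109"},
}

def canonical_marker(name: str) -> Optional[str]:
    """Return the search name for any known alias (case-insensitive), else None."""
    query = name.strip().upper()
    for primary, aliases in MARKER_ALIASES.items():
        if query == primary.upper() or query in {a.upper() for a in aliases}:
            return primary
    return None

print(canonical_marker("d19mit109"))  # RH114718
```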

The Overview section of Fig. 13.2 shows a region of 1 Mb of chromosome 19 centered around the marker, labeled D19MIT109 in this view. More than 30 mouse genes are predicted in this region, some already known and some new. The Detailed View at the bottom of the page is a zoomed-in display of the region around the marker. To get a better view of the genes and transcripts in this region, zoom out on the bottom view by clicking on the longest bar in the zoom control (closest to the minus sign). The Detailed View will now show the same region of chromosome 19 as the overview, but with many additional features (Fig. 13.3). The splice patterns of the genes and gene predictions are shown, as are regions of homology between the genome and other proteins and mRNAs. Pointing the computer mouse at any feature allows the user to open a small menu that links to additional descriptions.
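
The 1 Mb overview is a reasonable match for a 0.5 cM critical region if one assumes the commonly quoted genome-wide average of roughly 2 Mb per cM in the mouse. That rate is an assumption made here, not a figure from this article, and local recombination rates vary. The Python sketch below estimates the physical size of the interval, builds a window centered on a marker, and lists the genes that overlap it. The marker position and gene records are made-up placeholders, not real annotations.

```python
from typing import List, NamedTuple, Tuple

# ASSUMPTION: genome-wide average for the mouse; local rates differ.
AVG_MB_PER_CM = 2.0

class Gene(NamedTuple):
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

def cm_to_bp(interval_cm: float, mb_per_cm: float = AVG_MB_PER_CM) -> int:
    """Estimate the physical size in base pairs of a genetic interval."""
    return round(interval_cm * mb_per_cm * 1_000_000)

def centered_window(position: int, width: int) -> Tuple[int, int]:
    """Return (start, end) of a window of `width` bp centered on `position`."""
    start = max(1, position - width // 2)  # never run off the chromosome start
    return start, start + width - 1

def genes_in_window(genes: List[Gene], start: int, end: int) -> List[Gene]:
    """Keep genes that overlap the closed interval [start, end]."""
    return [g for g in genes if g.start <= end and g.end >= start]

# Hypothetical values for illustration only.
marker_pos = 5_000_000
toy_genes = [
    Gene("geneA", 4_400_000, 4_450_000),  # left of the window
    Gene("geneB", 4_490_000, 4_520_000),  # straddles the left edge
    Gene("geneC", 5_100_000, 5_180_000),  # inside the window
    Gene("geneD", 5_600_000, 5_650_000),  # right of the window
]

width = cm_to_bp(0.5)  # 1,000,000 bp under the assumed rate
start, end = centered_window(marker_pos, width)
hits = genes_in_window(toy_genes, start, end)
print(start, end, [g.name for g in hits])
```

A gene is counted if any part of it falls in the window, which mirrors how a browser view shows genes that run off the edge of the display.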

Consider the new gene indicated by the red arrow in Fig. 13.3. To view general information about this gene, hold the computer mouse over the gene graphic and select Transcript Information from the pop-up menu. The GeneView window (Fig. 13.4) provides a description of this gene, as well as a link to the GeneView window for the putative human ortholog (Fig. 13.4, Homology Matches section). To view the database sequences that align with the predicted exons of the new mouse gene, place the computer mouse pointer over the gene in the Detailed View (Fig. 13.3, arrow) and select Supporting evidence from the pop-up menu. Fig. 13.5 depicts the mRNA and protein sequences that align with exons in the new gene. Click on any of the green boxes to see the alignment of the database sequence with the new transcript.
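
Conceptually, the supporting-evidence display pairs each predicted exon with the database sequences whose alignments cover it. The following Python sketch illustrates that idea with simple interval overlap. It is not Ensembl's actual method, and the exon coordinates and sequence labels are invented placeholders.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based inclusive

def overlaps(a: Interval, b: Interval) -> bool:
    """True if two closed intervals share at least one base."""
    return a[0] <= b[1] and b[0] <= a[1]

def evidence_per_exon(
    exons: List[Interval], alignments: Dict[str, List[Interval]]
) -> Dict[Interval, List[str]]:
    """Map each exon to the sorted names of sequences with an aligned block on it."""
    support = {}
    for exon in exons:
        support[exon] = sorted(
            name
            for name, blocks in alignments.items()
            if any(overlaps(exon, block) for block in blocks)
        )
    return support

# Hypothetical exons and aligned blocks, in genomic coordinates.
exons = [(100, 250), (400, 520), (900, 1010)]
alignments = {
    "mRNA_1": [(95, 250), (400, 520)],
    "protein_1": [(410, 500), (905, 1000)],
}

support = evidence_per_exon(exons, alignments)
for exon, names in support.items():
    print(exon, names)
```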

The zoomed-out Detailed View also provides links to computed regions of orthology between the mouse and human genomes (Fig. 13.3, pink bars). As the mouse genome assembly and annotation lag behind those of the human, it may also be useful to view the human genes in an orthologous region of the genome.

UCSC also provides a mouse genome browser and the BLAT search tool for use with the latest mouse genome sequence assemblies. The links are available from the UCSC genome browser home page. Mouse genome analysis tools developed at the NCBI, including a mouse Map Viewer and mouse BLAST pages, are also available.

Copyright 2002 Nature Publishing