Combining many interaction networks to predict gene function and analyze gene lists
Sara Mostafavi
Department of Computer Science, Stanford University, Stanford, CA, USA
Corresponding Author
Quaid Morris
Banting and Best Department of Medical Research, University of Toronto, Ontario, Canada
Departments of Molecular Genetics and Computer Science, University of Toronto, Ontario, Canada
The Donnelly Centre, University of Toronto, Ontario, Canada
Correspondence: Dr. Quaid Morris, Banting and Best Department of Medical Research, University of Toronto, 160 College St., Rm 616, Toronto, Ontario, M5S3E1, Canada
E-mail:[email protected]
Colour Online: See article online to view Figs. 1‒3 in colour.
Abstract
In this article, we review how interaction networks can be used, alone or in combination, in an automated fashion to provide insight into gene and protein function. We describe the concept of a “gene-recommender system” that can be applied to any large collection of interaction networks to predict gene or protein function from a query list of proteins that share a function of interest. We discuss these systems in general and focus on one specific system, GeneMANIA, which has unique features and uses algorithms that differ from those of most other systems.