Andy M. Yip (1,2) and Steve Horvath (3,4)
1 Dept. of Mathematics, UCLA
2 Dept. of Mathematics, National University of Singapore
3 Dept. of Human Genetics, David Geffen School of Medicine, UCLA
4 Dept. of Biostatistics, School of Public Health, UCLA
shorvath@mednet.ucla.edu
http://www.ph.ucla.edu/biostat/people/horvath.htm
Department of Human Genetics and Department of Biostatistics
University of California, Los Angeles, CA 90095
Background
Network methods are increasingly used to represent the interactions of genes and/or proteins. Genes or proteins that are directly linked may have a similar biological function or may be part of the same biological pathway. Since the information on the connection (adjacency) between two nodes may be noisy or incomplete, it can be desirable to consider alternative measures of pairwise interconnectedness. Here we study a class of measures that are proportional to the number of neighbors that a pair of nodes share. For example, the topological overlap measure by Ravasz et al. [1] can be interpreted as a measure of agreement between the m=1 step neighborhoods of two nodes. Several studies have shown that two proteins with a higher topological overlap are more likely to belong to the same functional class than proteins with a lower topological overlap. Here we address the question of whether a measure of topological overlap based on higher-order neighborhoods could give rise to a more robust and sensitive measure of interconnectedness.
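For orientation, one commonly used form of the m=1 topological overlap for an unweighted network can be written as follows. The notation is an assumption made for this sketch and does not appear on this page: a_ij is the 0/1 adjacency between nodes i and j, l_ij counts their shared direct neighbors, and k_i is the number of direct neighbors of node i.

```latex
\[
t_{ij} \;=\; \frac{l_{ij} + a_{ij}}{\min(k_i, k_j) + 1 - a_{ij}},
\qquad
l_{ij} \;=\; \sum_{u \neq i,j} a_{iu}\, a_{uj},
\qquad
k_i \;=\; \sum_{u \neq i} a_{iu},
\qquad i \neq j .
\]
```

In this form the overlap lies between 0 and 1, and it equals 1 when two linked nodes are such that every neighbor of the less connected node is also a neighbor of the other.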
Results
We generalize the topological overlap measure from m=1 step neighborhoods to m ≥ 2 step neighborhoods. This allows us to define the m-th order generalized topological overlap measure (GTOM) by (i) counting the number of m-step neighbors that a pair of nodes share and (ii) normalizing this count to take a value between 0 and 1. Using theoretical arguments, a yeast co-expression network application, and a fly protein network application, we illustrate the usefulness of the proposed measure for module detection and gene neighborhood analysis.
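The two steps, counting shared m-step neighbors and normalizing, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes an unweighted, undirected network stored as a dictionary of neighbor sets, takes the m-step neighborhood of a node to be all other nodes reachable within m steps, and normalizes by the smaller neighborhood size; the function names `m_step_neighbors` and `gtom` are hypothetical.

```python
from collections import deque


def m_step_neighbors(adj, source, m):
    """Return the nodes other than `source` reachable within m steps (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == m:          # do not expand beyond m steps
            continue
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist) - {source}


def gtom(adj, m=1):
    """Sketch of an m-th order overlap for all node pairs (assumed form).

    `adj` maps each node to the set of its direct neighbors (symmetric).
    """
    if m < 1:
        raise ValueError("this sketch requires m >= 1")
    nbrs = {i: m_step_neighbors(adj, i, m) for i in adj}
    t = {}
    for i in adj:
        for j in adj:
            if i == j:
                t[i, j] = 1.0
                continue
            a_ij = 1 if j in adj[i] else 0
            # (i) count the m-step neighbors shared by i and j
            shared = len(nbrs[i] & nbrs[j])
            # (ii) normalize so that the value lies between 0 and 1
            denom = min(len(nbrs[i]), len(nbrs[j])) + 1 - a_ij
            t[i, j] = (shared + a_ij) / denom
    return t


# Toy example: a path 0 - 1 - 2 - 3
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
t1 = gtom(path, m=1)
t2 = gtom(path, m=2)
print(t1[0, 3], t2[0, 3])
```

On this toy path the two end nodes share no direct neighbors, so their m=1 overlap is 0, while at m=2 both reach nodes 1 and 2 and the overlap rises to 2/3. This illustrates how a larger m makes the measure more sensitive to indirect connections.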
Conclusions
Topological overlap can serve as an important filter to counter the effects of spurious or missing connections between network nodes. The m-th order topological overlap measure allows one to trade off sensitivity against specificity in defining pairwise interconnectedness and network modules.
Yip A, Horvath S (2007) Gene network interconnectedness and the generalized topological overlap measure. BMC Bioinformatics 8:22.
Tutorial in
Microsoft Word Format
Dataset
Data
Annotation File
PowerPoint version PDF version
Weighted Gene Co-Expression Network Page
2007-01-27
Please send your suggestions and comments to: shorvath@mednet.ucla.edu