Weighted Gene Co-Expression Network Analysis Software
Systems biology software for microarray analysis that helps find important genes and pathways.
WGCNA for Windows (help file and sample files included; please follow the installation guide). A user manual is also available.
New! Version 1.1.1.4 (5/13/2008, updates log). We update the software almost weekly, so please keep both WGCNA and the network library current.
WGCNA library function file: latest version 1.1.1.0, archive of older versions
Description
Weighted gene co-expression network
analysis (WGCNA) is a systems biology method for analyzing microarray data,
gene information data, and microarray sample traits (e.g. case control status or
clinical outcomes). WGCNA can be used for constructing a weighted gene
co-expression network, for finding co-expression modules, and for calculating
module membership measures (related to intra-modular connectivity). WGCNA
facilitates a network based gene screening method that can be used to identify
candidate biomarkers or therapeutic targets. The gene screening method
integrates gene significance information (e.g. correlation between gene
expression and a clinical outcome) and module membership information to identify
biologically and statistically plausible genes. The software has a graphical
interface that facilitates straightforward input of microarray and clinical
trait data or pre-defined gene information. The software can analyze networks
composed of tens of thousands of genes and implements several options for
automatic and manual gene selection ("network screening").
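As a rough illustration of the network construction step, the sketch below builds a weighted adjacency matrix from pair-wise correlations between gene expression profiles and then sums each row to get a connectivity measure. It assumes the power adjacency a_ij = |cor(x_i, x_j)|^beta from Zhang and Horvath (2005). The function names, the toy data, and the default beta = 6 are illustrative assumptions and do not describe the internals of the Windows software.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def weighted_adjacency(expr, beta=6):
    """Build a weighted adjacency matrix from gene expression profiles.

    expr is a list of gene profiles (one list of sample values per gene).
    Entry [i][j] is |cor(gene_i, gene_j)| ** beta; the diagonal is left at 0
    so that a gene does not count as its own neighbor.
    beta = 6 is an assumed default, not a recommendation from this page.
    """
    g = len(expr)
    adj = [[0.0] * g for _ in range(g)]
    for i in range(g):
        for j in range(i + 1, g):
            a = abs(pearson(expr[i], expr[j])) ** beta
            adj[i][j] = adj[j][i] = a
    return adj


def connectivity(adj):
    """Connectivity of each gene: the sum of its connection strengths."""
    return [sum(row) for row in adj]


# Toy data (hypothetical): three genes measured in four samples.
toy_expr = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [4.0, 1.0, 3.0, 2.0],
]
toy_adj = weighted_adjacency(toy_expr)
print(connectivity(toy_adj))
```

Raising the absolute correlation to a power keeps the network weighted rather than thresholded: strong correlations stay close to 1 while weak ones shrink toward 0, so no hard cutoff is needed.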
Background
WGCNA begins with the understanding that the information captured by microarray
experiments is far richer than a list of differentially expressed genes. Rather,
microarray data are more completely represented by considering the relationships
between measured transcripts, which can be assessed by pair-wise correlations
between gene expression profiles. In most microarray data analyses, however,
these relationships go essentially unexplored. WGCNA starts from the level of
thousands of genes, identifies clinically interesting gene modules, and finally
uses intramodular connectivity and gene significance (e.g. based on the correlation
of a gene expression profile with a sample trait) to identify key genes in the
disease pathways for further validation. WGCNA alleviates the multiple testing
problem inherent in microarray data analysis. Instead of relating thousands of
genes to a microarray sample trait, it focuses on the relationship between a few
(typically fewer than 10) modules and the sample trait. Toward this end, it
calculates the eigengene significance (correlation between sample trait and
eigengene) and the corresponding p-value for each module. The module definition
does not make use of a priori defined gene sets. Instead, modules are
constructed from the expression data by using hierarchical clustering. Although
it is advisable to relate the resulting modules to gene ontology information to
assess their biological plausibility, it is not required. Because the modules
may correspond to biological pathways, focusing the analysis on intramodular hub
genes (or the module eigengenes) amounts to a biologically motivated data
reduction scheme. Because the expression profiles of intramodular hub genes are
highly correlated, typically dozens of candidate biomarkers result. Although
these candidates are statistically equivalent, they may differ in terms of
biological plausibility or clinical utility. Gene ontology information can be
useful for further prioritizing intramodular hub genes. Examples of biological
studies that show the importance of intramodular hub genes are reported in
Horvath et al. 2006, Carlson et al. 2006, Gargalovic et al. 2006, and
Ghazalpour et al. 2006.
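The module-level test described above can be sketched as follows: each module is reduced to a single summary profile, which is then correlated with the sample trait, so only one test per module is needed instead of one per gene. This is a simplified sketch under stated assumptions. The module eigengene is the first principal component of the module's expression data; here the mean of the standardized profiles stands in for it, which is only reasonable when the genes in a module are positively correlated. The p-value uses a Fisher z-transform with a normal approximation, not necessarily the test used by the software. All function names and data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def standardize(profile):
    """Center a profile to mean 0 and scale it to unit sample variance."""
    n = len(profile)
    mean = sum(profile) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in profile) / (n - 1))
    return [(v - mean) / sd for v in profile]


def module_summary(module_expr):
    """Stand-in for the module eigengene: mean of standardized gene profiles.

    Assumption: this replaces the first principal component for simplicity.
    """
    z = [standardize(gene) for gene in module_expr]
    return [sum(column) / len(z) for column in zip(*z)]


def approx_p_value(r, n):
    """Approximate two-sided p-value for a correlation r from n > 3 samples.

    Uses the Fisher z-transform with a normal approximation (an assumption).
    """
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2.0))


def module_trait_significance(modules, trait):
    """Map each module name to (correlation with trait, approximate p-value)."""
    result = {}
    for name, module_expr in modules.items():
        r = pearson(module_summary(module_expr), trait)
        result[name] = (r, approx_p_value(r, len(trait)))
    return result


# Toy data (hypothetical): eight samples, four controls followed by four cases.
toy_trait = [0, 0, 0, 0, 1, 1, 1, 1]
toy_modules = {
    "blue": [
        [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1],
        [2.0, 2.1, 1.9, 2.2, 4.1, 4.0, 3.9, 4.2],
    ],
    "grey": [
        [1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0],
    ],
}
for module_name, (r, p) in module_trait_significance(toy_modules, toy_trait).items():
    print(module_name, round(r, 3), p)
```

With a handful of modules, a correction for multiple testing only has to account for that many tests, rather than for thousands of individual genes.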
Snapshots: Step 1. Load Data, Step 2. Preprocess, Step 3. Network Construction, Step 4. Module Detection, Step 5. Gene Selection
To cite the WGCNA software, please use:
1. Horvath S, Zhang B, Carlson M, Lu KV, Zhu S, Felciano RM, Laurance MF, Zhao W, Shu Q, Lee Y, Scheck AC, Liau LM, Wu H, Geschwind DH, Febbo PG, Kornblum HI, Cloughesy TF, Nelson SF, Mischel PS (2006) "Analysis of Oncogenic Signaling Networks in Glioblastoma Identifies ASPM as a Novel Molecular Target", PNAS, November 14, 2006, Vol. 103, No. 46, 17402-17407
2. Zhang B, Horvath S (2005) "A General Framework for Weighted Gene Co-Expression Network Analysis", Statistical Applications in Genetics and Molecular Biology, Vol. 4, No. 1, Article 17
3. Langfelder P, Zhang B, Horvath S (2007) "Defining clusters from a hierarchical cluster tree: the Dynamic Tree Cut library for R", Bioinformatics, November/btm563
Additional material on WGCNA can be found here.
Bioinformatics Team Members
UCLA: Steve Horvath, Lin Wang, Wei Zhao, Peter Langfelder, Jun Dong, Tova Fuller, Mike Oldham, Paul Mischel, Stan Nelson, Jake Lusis, Tom Drake, Dan Geschwind, Jenny Papp, Anja Presson
Acknowledgement: Dan Salomon (Scripps), Sunil Kurian (Scripps), Pui-Yan Kwok (UCSF)
Supported by the Transplant Genomics Collaborative Group 1U19AI063603-01, NINDS/NIMH 1U24NS043562-01
Supported in part by the UCLA Specialized Program of Research Excellence (SPORE) in Prostate Cancer (P50CA092131) and by the Jonsson Comprehensive Cancer Center, Core grant (5P30CA016042-28)
Contact Us
Please register with us if you plan to download any of the programs on this web page, so that you can receive software updates. Email us your name, your affiliation, and the programs you plan to download. Contact us with suggestions and bug reports.