WGCNA: an R package for weighted correlation network analysis
Peter Langfelder1 and Steve Horvath1,2
1 Dept. of Human Genetics, UC Los Angeles,
2 Dept. of Biostatistics, UC Los Angeles
Peter (dot) Langfelder (at) gmail (dot) com,
SHorvath (at) mednet (dot) ucla (dot) edu
BMC Bioinformatics, 2008 9:559
Abstract
Correlation networks are increasingly being used in
bioinformatics applications. For example,
weighted gene co-expression network analysis is a systems biology
method for describing the correlation patterns among genes
across microarray samples. Weighted correlation network analysis
(WGCNA) can be used for finding clusters (modules) of highly correlated
genes, for summarizing such clusters using the module
eigengene or an intramodular hub gene, for relating modules to one another and to external sample traits
(using eigengene network methodology), and for calculating module
membership measures. Correlation networks facilitate network-based
gene screening methods that can be used to identify candidate
biomarkers or therapeutic targets. These methods have been
successfully applied in various biological contexts, e.g. cancer,
mouse genetics, yeast genetics, and analysis of brain imaging data. While parts of
the correlation network methodology have been described in separate publications, there is a
need to provide a user-friendly, comprehensive, and consistent
software implementation and an accompanying tutorial.
The WGCNA R software package is a comprehensive collection of R
functions for performing various aspects of weighted correlation
network analysis. The package includes functions for network
construction, module detection, gene selection, calculations of
topological properties, data simulation, visualization, and
interfacing with external software. Along with the R package we
also present R software tutorials. While the methods development
was motivated by gene expression data, the underlying data mining
approach can be applied to a variety of different settings.
Prerequisites
The WGCNA package requires the following packages to be installed: stats, fields, impute, grDevices,
dynamicTreeCut (1.20 or higher), qvalue, utils, and flashClust.
If your system does not have them installed, the easiest
way to install them is to issue the following command at the R prompt:
If you run an older version of R, the above may not install the flashClust package and the
newest version of the dynamicTreeCut
package. Should you encounter this problem, please manually download and install flashClust from this web page, and dynamicTreeCut from this web page.
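As a minimal sketch of the prerequisite check described above (not the package authors' own command), the following R code lists which of the required packages are missing and whether dynamicTreeCut meets the 1.20 minimum. The helper name missing_packages is hypothetical. The split of packages between CRAN and Bioconductor, and the use of BiocManager, are assumptions about current repositories and may not match the setup at the time of writing. The install calls are left commented out so that running the sketch changes nothing.

```r
# Packages named in the prerequisites list above.
required <- c("stats", "fields", "impute", "grDevices",
              "dynamicTreeCut", "qvalue", "utils", "flashClust")

# Hypothetical helper: return the subset of `pkgs` whose namespace cannot be loaded.
missing_packages <- function(pkgs) {
  is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!is_installed]
}

to_install <- missing_packages(required)

# Assumption: impute and qvalue come from Bioconductor, the rest from CRAN.
bioc_pkgs <- intersect(to_install, c("impute", "qvalue"))
cran_pkgs <- setdiff(to_install, bioc_pkgs)

message("Missing CRAN packages: ", paste(cran_pkgs, collapse = ", "))
message("Missing Bioconductor packages: ", paste(bioc_pkgs, collapse = ", "))

# The text asks for dynamicTreeCut 1.20 or higher; check it if it is present.
if (requireNamespace("dynamicTreeCut", quietly = TRUE) &&
    packageVersion("dynamicTreeCut") < "1.20") {
  message("dynamicTreeCut is older than 1.20; please update it.")
}

# Uncomment to actually install (assumes network access and BiocManager):
# if (length(cran_pkgs) > 0) install.packages(cran_pkgs)
# if (length(bioc_pkgs) > 0) {
#   if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
#   BiocManager::install(bioc_pkgs)
# }
```

The packages stats, grDevices, and utils ship with R itself, so they should never appear in the missing list.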
R package download and installation
Download the package WGCNA_0.83 (last updated 2009/11/12):
The package version numbers follow the format
packageName_major.minor-revision. Minor versions typically add or change some functionality;
revisions typically contain bugfixes and small additions that do not require any changes in the code
using the functions.
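To illustrate the packageName_major.minor-revision format, here is a small R sketch that splits such a label into its parts and compares two versions. The helper parse_package_label and the revision number in "0.83-1" are hypothetical, chosen only for illustration; package_version is a base R function.

```r
# Hypothetical helper: split "packageName_major.minor-revision" into its parts.
parse_package_label <- function(label) {
  parts <- strsplit(label, "_", fixed = TRUE)[[1]]
  stopifnot(length(parts) == 2)
  # package_version accepts both "." and "-" as separators.
  nums <- unclass(package_version(parts[2]))[[1]]
  list(
    name     = parts[1],
    major    = nums[1],
    minor    = nums[2],
    # The revision is optional; NA marks its absence.
    revision = if (length(nums) >= 3) nums[3] else NA_integer_
  )
}

current <- parse_package_label("WGCNA_0.83")
str(current)

# A revision of the same minor version (hypothetical "0.83-1") sorts after it.
is_newer <- package_version("0.83-1") > package_version("0.83")
```

Because revisions are described as bugfixes and small additions, code written against a given major.minor version should keep working when only the revision changes.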
Installation instructions
Short installation instructions, including other required and recommended packages,
are available here.
Should you discover bugs (of which there are most likely plenty), please report them to Peter Langfelder.
Problems installing or using the package
Please see our list of Frequently Asked Questions (and frequently given answers);
the solution to your problem may lie there. In particular, you can find answers about spurious Mac
errors, compatibility problems when upgrading WGCNA, and others.
If you still cannot solve the problem, email Peter
Langfelder.
Getting started with R and Weighted Gene Co-expression Network Analysis
The package described here is an add-on for the statistical language and environment R (free
software).
Our tutorial, described below, contains step-by-step
instructions so that even complete novices should be able to get started in R immediately.
Lastly, readers wishing to learn about the theory and published applications of WGCNA are invited to
visit the WGCNA
main page.