Download: gtom.R
Usage: gtom = GTOMmdist1(adjmat1, m)
<adjmat1> is a symmetric binary adjacency matrix encoding an undirected unweighted network
<m> is a positive integer indicating the order of the GTOM
¡@
To obtain R (a free statistical computing software), click here.
We use a yeast cell cycle data set published in:
Spellman et al., (1998). Comprehensive Identification of Cell Cycle-regulated Genes of the Yeast Saccharomyces cerevisiae by Microarray Hybridization. Molecular Biology of the Cell 9, 3273-3297.
The original data set by Spellman et al. (1998) : combine.txt (tab-delimited)
The data set we preprocessed1 (4000 genes, 44 conditions): YEASTCellCycle4000.csv (comma-separated)
The corresponding gene information file: YEASTCellCycle4000_geneinfo.csv (comma-separated)
¡@
1 First, arrays with more than 5% missing values are removed. Then the 4000 most varying genes (measured by standard deviation) are selected.
# Read in the expression levels
datExpr = read.csv("YEASTCellCycle4000.csv", header=T, row.names=1)
datExpr = t(datExpr)
# Read in the gene information
datInfo = read.csv("YEASTCellCycle4000_geneinfo.csv", header=T, row.names=1)
# Correlation matrix
cormat = cor(datExpr,use="p")
# Adjacency matrix
tau = 0.7
AdjMat1 <- as.matrix(abs(cormat)>=tau)
diag(AdjMat1)=0
# Restrict our attention to 1000
highly connected genes
Degree = apply(AdjMat1,1,sum)
DegreeRank = rank(-Degree)
keep.node =
DegreeRank <= 1000
AdjMat1.1000 = AdjMat1[keep.node,
keep.node]
datExpr.1000 = datExpr[, keep.node]
datInfo.1000 = datInfo[keep.node, ]
# Compute the GTOMm dissimilarity
(m=2 in this example)
dissGTOM2 = GTOMmdist1(AdjMat1.1000,
m=2)
# Hierarchical Clustering
hierGTOM2 = hclust(as.dist(dissGTOM2),method="average")
plot(hierGTOM2, main="GTOM2 Dissimilarity Measure", labels=F, xlab="", sub="")

# To define modules, we choose a height cutoff for the
dendrogram
module1 = cutree(hierGTOM2, h=0.5)
image(as.matrix(module1[hierGTOM2$order]), col=1:max(module1))

# Create a Topological Overlap Matrix plot
options(expressions = 10000)
module1_reverse = module1
module1_reverse[hierGTOM2$order[length(module1):1]] = module1[hierGTOM2$order]
heatmap(as.matrix(dissGTOM2), Rowv=as.dendrogram(hierGTOM2), Colv=
as.dendrogram(hierGTOM2), scale="none", revC=T, ColSideColors=as.character(module1),
RowSideColors=as.character(module1_reverse), labRow=F, labCol=F)

# Create a MDS plot (color by module assignment)
cmdGTOM2 = cmdscale(dissGTOM2,k=2)
plot(cmdGTOM2, col=as.character(module1), xlab="Dimension 1", ylab="Dimension
2")

# Color the MDS plot by gene essentiality (red =
essential)
attach(datInfo)
essentiality.1000 = essentiality[keep.node]
plot(cmdGTOM2, col=as.character(essentiality.1000+1), xlab="Dimension 1", ylab="Dimension
2")

# Define the GTOM-based connectivity measure
GTOM2.Conn = apply(1-dissGTOM2, 2, sum)
# Relate GTOM connectivity with gene essentiality using
Spearman correlation coefficient
cor(GTOM2.Conn, essentiality.1000, use="p", method="s")
0.1962606
2006-03-07
Please send your suggestions and
comments to: shorvath@mednet.ucla.edu