mRNA Microarray Data

This section gives an example work-flow for learning gene networks associated with kidney renal clear cell carcinoma (KIRC) from patient tumor samples:

1. Obtain gene expression data for KIRC, profiled on an mRNA microarray platform.
2. Keep only the tumor samples.
3. Filter genes so that only the top 5% most variable genes remain.
4. Use the XMRF function to learn the network structure. Note that it is always good practice to visualize the data to confirm the distributional family before model fitting. In this example, as shown in Figure S9, the data follow an approximately Gaussian distribution, so fitting a Gaussian graphical model (GGM) is appropriate.
5. Write the network in GML format and view it in Cytoscape (Figure S10).
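
The distributional check recommended in step 4 can be complemented with a numerical summary. The following is a minimal sketch on simulated values, not on the KIRC data; the variable names and the simulated mean and standard deviation are assumptions chosen for illustration, and only base R functions are used.

```r
# Simulated stand-in for a vector of expression values (assumption:
# roughly Gaussian, as suggested for the KIRC array data).
set.seed(2)
x <- rnorm(500, mean = 8, sd = 1.5)

# Normal Q-Q coordinates without opening a graphics device.
qq <- qqnorm(x, plot.it = FALSE)

# Correlation between theoretical and sample quantiles;
# values close to 1 are consistent with a Gaussian shape.
qq.cor <- cor(qq$x, qq$y)

# Shapiro-Wilk test of normality (accepts between 3 and 5000 values).
sw <- shapiro.test(x)

qq.cor
sw$p.value
```

To draw the plot itself, `qqnorm(x); qqline(x)` can be used. Because `shapiro.test` is limited to 5000 values, a full expression matrix would need to be subsampled before testing.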


Code snippets for the above work-flow are provided as follows:

	> library(TCGA2STAT)
	
	> # Get TCGA data
	> # Obtain mRNA array gene expression data for KIRC patients
	> kirc.dat <- getTCGA(disease="KIRC", data.type="mRNA_Array")
	> # Keep the primary tumor samples from the expression matrix
	> kirc.tum <- SampleSplit(kirc.dat$dat)$primary.tumor
	
	> # Keep only the top 5% most variable genes, dropping genes with missing values
	> gene.var <- apply(kirc.tum, 1, var)
	> nac <- apply(kirc.tum, 1, function(x) sum(is.na(x)))
	> kirc.tum.gd <- kirc.tum[!is.na(gene.var) & gene.var >= quantile(gene.var, probs=0.95, na.rm=TRUE) & nac==0, ]

	> # Plot the data to check the distributional family
	> hist(kirc.tum.gd, breaks=20)
	
	> # Fit a Gaussian graphical model to the data
	> kirc.tum.fit <- XMRF(kirc.tum.gd, method="GGM", N=100, stability="STAR", nlams=10, beta=0.001)

	> # Write the gene network in GML format for viewing in Cytoscape
	> plotGML(kirc.tum.fit, fn="kirc.tum.array.gml", i=2, weight=TRUE, vars=rownames(kirc.tum.gd))
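
Because the snippet above depends on downloading TCGA data, the gene-filtering step can be tried in isolation on a simulated matrix. This is a minimal sketch under stated assumptions: `expr`, `gene.var`, `n.missing`, `cutoff`, `keep` and `expr.top` are hypothetical names, and the simulated 200 x 30 matrix merely stands in for `kirc.tum` (genes in rows, samples in columns).

```r
# Simulated stand-in for kirc.tum: 200 genes (rows) x 30 samples (columns).
set.seed(1)
expr <- matrix(rnorm(200 * 30), nrow = 200,
               dimnames = list(paste0("gene", 1:200), paste0("s", 1:30)))
expr[5, 3] <- NA   # one gene with a missing value, to exercise the NA filter

# Per-gene variance (NA when a gene has missing values) and
# per-gene count of missing values.
gene.var  <- apply(expr, 1, var)
n.missing <- apply(expr, 1, function(x) sum(is.na(x)))

# Keep complete genes whose variance is at or above the 95th percentile.
cutoff   <- quantile(gene.var, probs = 0.95, na.rm = TRUE)
keep     <- !is.na(gene.var) & gene.var >= cutoff & n.missing == 0
expr.top <- expr[keep, , drop = FALSE]

dim(expr.top)
```

Placing `!is.na(gene.var)` first keeps the logical index free of `NA` values, and `drop = FALSE` preserves the matrix shape even if only one gene passes the filter.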

Figure S9: Distribution of mRNA expression profiled with microarray from KIRC tumor samples.

Figure S10: Gene network for KIRC tumor samples, estimated by fitting a GGM via XMRF(...,method="GGM") to the mRNA expression data.



2015-05-29