StARS Selection

To fit a network over a series of regularization parameters, rather than at the single lambda shown in the previous section, pass a numeric vector of regularization values as the lambda.path parameter of the XMRF function.
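As a rough illustration of what such a vector might look like, the sketch below builds a small grid of lambda values in base R. The grid endpoints (1 and 0.01), its length, the log spacing, and the name lambda_grid are all assumptions chosen for illustration, not values recommended by the package. The call to XMRF is left commented out because it needs the package and a data matrix (here assumed to be called simDat).

```r
# Hypothetical grid: 10 values, log-spaced from 1 down to 0.01, largest first.
# Endpoints, length and spacing are illustrative assumptions only.
lambda_grid <- exp(seq(log(1), log(0.01), length.out = 10))

# Passing the grid through the lambda.path parameter described above
# (requires the XMRF package and a data matrix `simDat`):
# fit <- XMRF(simDat, method = "LPGM", lambda.path = lambda_grid)
```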


Another option for studying the Markov networks over the complete regularization path is to let the XMRF method choose the path from the null model (empty network) to the full model (saturated network). In this case, the XMRF(...,method="",...) function computes the maximum lambda that gives the null model and the minimum lambda that gives the full model for each of the parametric families employed. The maximum lambda is computed from the input data matrix: it is the maximum element of the column-wise products of the data matrix (an n x p matrix), normalized by the number of observations. From the maximum lambda value, the number of lambdas (nlams), and the minimum lambda (lmin), a sequence of appropriate lambda values is then computed.
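The computation described above can be sketched in base R as follows. This is a minimal sketch rather than the package's internal code. Several details are assumptions made here: only off-diagonal products are considered (since diagonal entries do not correspond to edges), lmin is treated as a fraction of the maximum lambda, and the sequence is log-spaced. The helper names lambda_max and lambda_sequence are hypothetical, and the Poisson toy data stand in for a real count matrix.

```r
set.seed(1)
n <- 200
p <- 30
X <- matrix(rpois(n * p, lambda = 2), nrow = n, ncol = p)  # toy n x p count matrix

# Largest off-diagonal entry of t(X) %*% X, divided by the number of observations.
# Restricting to off-diagonal entries is an assumption of this sketch.
lambda_max <- function(X) {
  cp <- crossprod(X)                  # p x p matrix of column-wise products
  max(cp[upper.tri(cp)]) / nrow(X)
}

# Log-spaced path from lmax down to lmin * lmax (spacing and the role of lmin
# as a fraction of lmax are assumptions).
lambda_sequence <- function(X, nlams = 20, lmin = 0.01) {
  lmax <- lambda_max(X)
  exp(seq(log(lmax), log(lmin * lmax), length.out = nlams))
}

lams <- lambda_sequence(X, nlams = 20, lmin = 0.01)
```

The path is ordered from the largest lambda (sparsest network) to the smallest (densest network), which matches the null-to-full direction described in the text.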


Stability selection via StARS seeks the lambda value on the regularization path that yields the most stable network (that is, the one least variable under bootstrap perturbations). Specifically, the variability of each fitted network is measured by the stability of the edges inferred from the bootstrap samples. The network with the smallest penalization whose variability is below the user-specified cutoff (beta) is selected as the final optimal network (Liu et al., 2010).
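The selection rule can be sketched in base R as below. This follows the general StARS recipe (edge selection frequency, per-edge variability 2 * theta * (1 - theta), averaged over node pairs and monotonized along the path) and is not taken from the XMRF source. The function names edge_instability and stars_select, the toy instability values, and the cutoff of 0.05 are assumptions for illustration.

```r
# adj_list: list of p x p 0/1 adjacency matrices, one per bootstrap sample,
# all fitted at the same lambda.
edge_instability <- function(adj_list) {
  theta <- Reduce(`+`, adj_list) / length(adj_list)  # edge selection frequency
  xi <- 2 * theta * (1 - theta)                      # per-edge variability
  mean(xi[upper.tri(xi)])                            # average over node pairs
}

# lambdas must be sorted from largest (sparsest) to smallest (densest).
# Returns the index of the smallest lambda whose monotonized instability
# does not exceed beta, or NA if no lambda qualifies.
stars_select <- function(lambdas, instab, beta = 0.05) {
  stopifnot(length(lambdas) == length(instab))
  d_bar <- cummax(instab)            # monotonized instability along the path
  ok <- which(d_bar <= beta)
  if (length(ok) == 0) return(NA_integer_)
  max(ok)
}

# Toy path with made-up instability values
lambdas <- c(1.0, 0.5, 0.25, 0.1)
instab  <- c(0.00, 0.02, 0.04, 0.20)
best <- stars_select(lambdas, instab, beta = 0.05)   # index 3, i.e. lambda = 0.25
```

Taking the running maximum before thresholding keeps the rule from selecting a very small lambda at which the network has become so dense that it looks stable again.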


In the following example, we fit XMRF(...,method="LPGM") to learn the same simulated scale-free network of 30 nodes from 200 observations along a path of 20 regularization parameters. Figure S6 shows the printed output of the following code:

> library(XMRF)
> n = 200
> p = 30
   
# Simulate a scale-free network of 30 nodes and 200 samples
> sim <- XMRF.Sim(n=n, p=p, model="LPGM", graph.type="scale-free")
> simDat <- sim$X
   
# Run LPGM on a whole regularization path
> lpgm.fit <- XMRF(simDat, method="LPGM", nlams=20, stability="STAR", th=0.001)

Figure S6: Screenshot from RStudio of the commands and output of fitting the XMRF(...,method="LPGM") model over a path of 20 regularization parameters. The optimal network is the 10th of the 20 networks.



2015-05-29