R: Bayesian Network Discovery using a Standard MCMC Algorithm
Networks.STD
R Documentation
Bayesian Network Discovery using a Standard MCMC Algorithm
Description
This function implements the standard Markov chain Monte Carlo (MCMC) algorithm to perform feature selection and sub-network discovery using a Bayesian nonparametric model based on the Dirichlet process mixture (DPM) model and the Ising model.
a vector of p-values obtained from large scale statistical hypothesis testing
net
an "n" by "n" binary adjacent matrix (0/1) for the network configuration with n=length(pvalue)
iter
the number of iterations; the default is 5000
nburns
the number of burn-in; the default is 2000
piall
a vector of possible choices for "pi0" in an increasing order; the default value is c(0.75, 0.8, 0.85, 0.9)
rhoall
a vector of possible choices of "rho0" and "rho1" in an increasing order; the default value is c(0.5, 1, 5, 10, 15)
status
a logical variable indicating whether the you are initial an MCMC chain or resume a chain
fit
a list provide the current values of parameters which are used for resume the chain
show.steps
integer representing the frequency of the results of iterations presented, the default value is 1.The setting is invalid when trace=FALSE. The setting would not affect the data saved, only for printing
showlikelihood
a logical variable indicating whether to show the log-likelihood value simultaneously.Set TRUE if show the log-likelihood value simultaneously, FALSE, otherwise. FALSE is the default setting
likelihood.frequency
a number representing the frequency showing the log-likelihood value simultaneously. For example, setting likelihood.frequency=100 means showing the log-likelihood value every 100 iterations. The default setting is 100 and it is recommended that do not set a small frequency because it slow down the MCMC chain updating
Details
This function implements a Bayesian nonparametric mixture model for feature selection incorporating network information (Zhao et al., 2014):
r_i| g_i, theta ~ N(mu_g_i, sigma_g_i),
g_i | z_i=k, q_k ~ Discrete(a_k, q_k),
theta ~ G_0k, for g in a_k,
q_k ~ Dirichlet(tau_k 1_{L_k}/L_k),
theta={theta_g}_{g in a_0 and a_1}
theta_g=(mu_g, sigma_g)
where we define
Index
a_0=(-L_0+1,-L_0+2,...,0) , a_1=(1,2,...,L_1) and the correspondent probability q_0=(q_-L_0+1, q_-L_0+2, ...,q_0), q_1=(q_1, q_2, ..., q_L_1), according to the definition of Discrete(a_k, b_k), for example, Pr(g_i=L_0+2)=q_-L_0+2.
Assumption
In this algorithm, we assume that "important" features should have larger statistics comparing to "unimportant" ones without the loss of generality. In this regard, we set the restriction mu_g<mu_g+1 for g=-L_0+1, -L_0+2,...,L_1.
This function implements the NET-DPM-1 Zhao et al.(2014). Please refer to the Appendix B.1 for more details.
Value
An object of class "Networks.STD" containing Bayesian information of the MCMC chain
trace
a matrix of dimension (iter-nburns) by length(pvalue) containing posterior samples of classification label "g_i"
convergence
MCMC Heidelberger and Welch convergence diagnostic
graph
An igraph graph object of full network
model
a list containing parameters of the MCMC chain for resuming the chain
Author(s)
Zhou Lan, Jian Kang, Tianwei Yu and Yize Zhao
Department of Biostatistics and Bioinformatics, Emory University
References
Zhao, Y., Kang, J., Yu, T. A Bayesian nonparametric mixture model for selecting gene and gene-sub network, Annals of Applied Statistics, In press: 2014.
Zhou Lan, Jian Kang, Tianwei Yu, Yize Zhao, BANFF: an R package for network identifications via Bayesian nonparametric mixture models, working paper.
Examples
library(igraph)
###Creating the network of 10X10 image
g <- graph.lattice(length=10,dim=2)
net=as(get.adjacency(g,attr=NULL),"matrix")##this is the input of argument 'net'
##Assign the signal elements with signal intention
##as normal distribution N(1,0.2). While noise is set as N(0,0.2)
newz=rep(0,100)
for (i in 3:7)
{
newz[(i*10+3):(i*10+7)]=1
}
testcov<-0
for(i in 1:100){
if(newz[i]==0){
testcov[i]<-rnorm(1,mean=0,sd=0.2)
}else{
testcov[i]<-rnorm(1,mean=1,sd=0.2)
}
}
##The profile of the image
image(matrix(testcov,10,10),col=gray(seq(0,1,length=255)))
##Transform the signals into pvalue form and begin identification
pvalue=pnorm(-testcov)
total=Networks.STD(pvalue,net,iter=3,nburns=1,
piall=c(0.8, 0.85, 0.9, 0.95),rhoall=c(0.5,1,5,10,15))
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(BANFF)
Loading required package: foreach
Loading required package: doParallel
Loading required package: iterators
Loading required package: parallel
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/BANFF/Networks.STD.Rd_%03d_medium.png", width=480, height=480)
> ### Name: Networks.STD
> ### Title: Bayesian Network Discovery using a Standard MCMC Algorithm
> ### Aliases: Networks.STD
>
> ### ** Examples
>
> library(igraph)
Attaching package: 'igraph'
The following objects are masked from 'package:stats':
decompose, spectrum
The following object is masked from 'package:base':
union
> ###Creating the network of 10X10 image
> g <- graph.lattice(length=10,dim=2)
> net=as(get.adjacency(g,attr=NULL),"matrix")##this is the input of argument 'net'
> ##Assign the signal elements with signal intention
> ##as normal distribution N(1,0.2). While noise is set as N(0,0.2)
> newz=rep(0,100)
> for (i in 3:7)
+ {
+ newz[(i*10+3):(i*10+7)]=1
+ }
> testcov<-0
> for(i in 1:100){
+ if(newz[i]==0){
+ testcov[i]<-rnorm(1,mean=0,sd=0.2)
+
+ }else{
+ testcov[i]<-rnorm(1,mean=1,sd=0.2)
+
+ }
+ }
> ##The profile of the image
> image(matrix(testcov,10,10),col=gray(seq(0,1,length=255)))
> ##Transform the signals into pvalue form and begin identification
> pvalue=pnorm(-testcov)
> total=Networks.STD(pvalue,net,iter=3,nburns=1,
+ piall=c(0.8, 0.85, 0.9, 0.95),rhoall=c(0.5,1,5,10,15))
iteration: 1
iteration: 2
iteration: 3
>
>
>
>
>
> dev.off()
null device
1
>