The number of simulated data or replacements to be tried
x
The dataset, an n by p numeric matrix, where n is number of observations and p the dimension of data.
g
The number of components of the mixture model
g1,g2
The range of the number of components of the mixture model
distr
A three letter string indicating the type of distribution to be fit. See Details.
ncov
A small integer indicating the type of covariance structure. See Details.
popPAR
A list with components pro, a numeric vector of the mixing proportion of each component; mu, a p by g matrix with each column as its corresponding mean;
sigma, a three dimensional p by p by g array with its jth component matrix (p,p,j) as the covariance matrix for jth component of mixture models;
dof, a vector of degrees of freedom for each component; delta, a p by g matrix with its columns corresponding to skew parameter vectors.
replace
A logical value indicating whether replacement to be used
itmax
A big integer specifying the maximum number of iterations to apply
epsilon
A small number used to stop the EM algorithm loop when the relative difference between log-likelihood at each iteration become sufficient small.
Details
The distribution type, distr, is one of the following values:
"mvn" for a multivariate normal, "mvt" for a multivariate t-distribution, "msn" for a multivariate skew normal distribution and "mst" for a multivariate skew t-distribution.
The covariance matrix type, represented by the ncov parameter, may be any one of the following:
ncov=1 for a common variance, ncov=2 for a common diagonal variance, ncov=3 for a general variance, ncov =4 for a diagonal variance, ncov=5 for
sigma(h)*I(p)(diagonal covariance with same identical diagonal element values).
When replace is FALSE, parametric bootstrap is used; otherwise replacement method is used.
Value
bootstrap gives standard errors.
bootstrap.noc returns a list with components ret, a B by (g2-g1) matrix of -2log(Lambda),
vlk, the loglikehood for each g in the range of g1 to g2, and pvalue, the p-values of g vs g+1. The results of fitting mixture models are stored in curent working directory, which can be used via command in R: obj <- dget("ReturnOf_g_???.ret").
References
McLachlan G.J. and Krishnan T. (2008). The EM Algorithm and Extensions (2nd). New Jersay: Wiley.
McLachlan G.J. and Peel D. (2000). Finite Mixture Models. New York: Wiley.
See Also
EmSkew,rdemmix
Examples
n1=300;n2=300;n3=400;
nn <-c(n1,n2,n3)
n <- sum(nn)
p <- 2
g <- 3
sigma<-array(0,c(p,p,g))
for(h in 1:3) sigma[,,h]<-diag(p)
mu <- cbind(c(4,-4),c(3.5,4),c( 0, 0))
# for other distributions,
#delta <- cbind(c(3,3),c(1,5),c(-3,1))
#dof <- c(3,5,5)
distr="mvn"
ncov=3
#first we generate a data set
set.seed(111) #random seed is set
dat <- rdemmix(nn,p,g,distr,mu,sigma,dof=NULL,delta=NULL)
#start from initial partition
clust<- rep(1:g,nn)
obj <- EmSkewfit1(dat,g,clust,distr,ncov,itmax=1000,epsilon=1e-5)
# do bootstrap (stadard error analysis)
## Not run:
std <- bootstrap(dat,n,p,g,distr,ncov,obj,B=19,
replace=TRUE,itmax=1000,epsilon=1e-5)
print(std)
# do booststrap analysis of -2log(Lambda).
# alternatively data can be input as follow,
# dat <- read.table("mydata.txt",header=TRUE)
# p <- ncol(dat)
# n <- nrow(dat)
lad <- bootstrap.noc(dat,n,p,2,4,distr,ncov,B=19,
replace=FALSE,itmax=1000,epsilon=1e-5)
print(lad)
# return of g=2
obj2 <- dget("ReturnOf_g_2.ret")
# return of g=3
obj3 <- dget("ReturnOf_g_3.ret")
# return of g=4
obj4 <- dget("ReturnOf_g_4.ret")
#The posterior probability matrix for (g=3) is obtained by
tau <- obj3$tau
## End(Not run)
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(EMMIXskew)
Loading required package: lattice
Loading required package: mvtnorm
Loading required package: KernSmooth
KernSmooth 2.23 loaded
Copyright M. P. Wand 1997-2009
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/EMMIXskew/bootstrap.Rd_%03d_medium.png", width=480, height=480)
> ### Name: bootstrap
> ### Title: Bootstrap
> ### Aliases: bootstrap bootstrap.noc
> ### Keywords: cluster datasets
>
> ### ** Examples
>
>
> n1=300;n2=300;n3=400;
> nn <-c(n1,n2,n3)
> n <- sum(nn)
> p <- 2
> g <- 3
>
>
> sigma<-array(0,c(p,p,g))
> for(h in 1:3) sigma[,,h]<-diag(p)
>
> mu <- cbind(c(4,-4),c(3.5,4),c( 0, 0))
>
> # for other distributions,
> #delta <- cbind(c(3,3),c(1,5),c(-3,1))
> #dof <- c(3,5,5)
>
> distr="mvn"
> ncov=3
>
> #first we generate a data set
> set.seed(111) #random seed is set
> dat <- rdemmix(nn,p,g,distr,mu,sigma,dof=NULL,delta=NULL)
>
> #start from initial partition
> clust<- rep(1:g,nn)
> obj <- EmSkewfit1(dat,g,clust,distr,ncov,itmax=1000,epsilon=1e-5)
>
> # do bootstrap (stadard error analysis)
>
> ## Not run:
> ##D std <- bootstrap(dat,n,p,g,distr,ncov,obj,B=19,
> ##D replace=TRUE,itmax=1000,epsilon=1e-5)
> ##D print(std)
> ##D
> ##D # do booststrap analysis of -2log(Lambda).
> ##D
> ##D # alternatively data can be input as follow,
> ##D # dat <- read.table("mydata.txt",header=TRUE)
> ##D # p <- ncol(dat)
> ##D # n <- nrow(dat)
> ##D
> ##D lad <- bootstrap.noc(dat,n,p,2,4,distr,ncov,B=19,
> ##D replace=FALSE,itmax=1000,epsilon=1e-5)
> ##D print(lad)
> ##D
> ##D # return of g=2
> ##D obj2 <- dget("ReturnOf_g_2.ret")
> ##D
> ##D # return of g=3
> ##D obj3 <- dget("ReturnOf_g_3.ret")
> ##D
> ##D # return of g=4
> ##D obj4 <- dget("ReturnOf_g_4.ret")
> ##D
> ##D #The posterior probability matrix for (g=3) is obtained by
> ##D
> ##D tau <- obj3$tau
> ##D
> ##D
> ## End(Not run)
>
>
>
>
>
>
> dev.off()
null device
1
>