Generate random numbers from specified mixture models, including univariate and multivariate
normal, t, skew normal, and skew t-distributions.
A vector giving the number of points in each cluster, c(n1,n2,...,ng)
n
The total number of points
p
Dimension of data
g
The number of clusters
distr
A three-letter string indicating the distribution type, see Details.
pro
A vector of mixing proportions, see Details.
mu
A p by g numeric matrix with each column corresponding to the mean of one component, see Details.
sigma
An array of dimension (p,p,g) whose first two dimensions hold the covariance matrix of each component, see Details.
dof
A vector of degrees of freedom for each component, see Details.
delta
A p by g matrix with each column corresponding to a skew parameter vector, see Details.
Details
The distribution type is determined by the distr parameter, which may take any one of the following values:
"mvn" for a multivariate normal distribution, "mvt" for a multivariate t-distribution, "msn" for a multivariate skew normal distribution and "mst" for a multivariate skew t-distribution.
The remaining parameters are: pro, a numeric vector of the mixing proportions of the components; mu, a p by g matrix whose jth column is the mean of the jth component;
sigma, a three-dimensional p by p by g array whose jth slice sigma[,,j] is the covariance matrix of the jth component of the mixture model;
dof, a vector of degrees of freedom, one for each component; delta, a p by g matrix whose jth column is the skew parameter vector of the jth component.
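As a minimal sketch of how these arguments fit together, the following base R code builds a parameter set for p = 2 and g = 3 and checks the dimensions described above. The helper check_mixture_params is hypothetical and is not part of the package. The requirements that pro sums to one and that each covariance matrix is symmetric are assumptions based on the usual definitions, not checks documented here.

```r
# Hypothetical helper (not part of EMMIXskew): verifies that the mixture
# parameters have the shapes described in Details.
check_mixture_params <- function(p, g, pro, mu, sigma, dof = NULL, delta = NULL) {
  stopifnot(
    length(pro) == g,
    # Assumption: mixing proportions are non-negative and sum to one.
    all(pro >= 0),
    isTRUE(all.equal(sum(pro), 1)),
    identical(dim(mu), as.integer(c(p, g))),
    identical(dim(sigma), as.integer(c(p, p, g)))
  )
  for (j in seq_len(g)) {
    # Assumption: each p by p slice is a symmetric covariance matrix.
    stopifnot(isSymmetric(matrix(sigma[, , j], p, p)))
  }
  if (!is.null(dof)) stopifnot(length(dof) == g)
  if (!is.null(delta)) stopifnot(identical(dim(delta), as.integer(c(p, g))))
  invisible(TRUE)
}

p <- 2
g <- 3
pro <- c(0.3, 0.3, 0.4)
mu <- cbind(c(4, -4), c(3.5, 4), c(0, 0))        # p by g, one column per component
sigma <- array(0, c(p, p, g))                    # p by p by g
sigma[, , 1] <- cbind(c(1, -0.1), c(-0.1, 1))
for (h in 2:3) sigma[, , h] <- diag(p)
dof <- c(3, 5, 5)                                # one value per component
delta <- cbind(c(3, 3), c(1, 5), c(-3, 1))       # p by g, one column per component

ok <- check_mixture_params(p, g, pro, mu, sigma, dof, delta)
```

A check of this kind fails early with an error if, for example, mu is passed with one row per component rather than one column per component.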
Value
Both rdemmix and rdemmix2 return an n by p numeric matrix of generated data;
rdemmix3 returns a list with components data, the generated data, and cluster, the clustering of the data.
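To make the two return shapes concrete, here is a base R sketch of a normal-mixture sampler that returns a list with data and cluster components. It is not the package's implementation: the function name sim_normal_mixture is hypothetical, component labels are drawn with sample(), and each component is sampled through a Cholesky factor of its covariance matrix.

```r
# Hypothetical stand-in for illustration; not the EMMIXskew implementation.
sim_normal_mixture <- function(n, p, g, pro, mu, sigma) {
  # Draw a component label for each observation from the mixing proportions.
  cluster <- sample(seq_len(g), size = n, replace = TRUE, prob = pro)
  data <- matrix(0, nrow = n, ncol = p)
  for (h in seq_len(g)) {
    idx <- which(cluster == h)
    if (length(idx) == 0) next
    # chol() returns an upper-triangular R with t(R) %*% R equal to the
    # covariance, so the rows of z %*% R have covariance sigma[, , h].
    R <- chol(matrix(sigma[, , h], p, p))
    z <- matrix(rnorm(length(idx) * p), ncol = p)
    # Shift each column by the corresponding entry of the component mean.
    data[idx, ] <- sweep(z %*% R, 2, mu[, h], "+")
  }
  list(data = data, cluster = cluster)
}

pro <- c(0.3, 0.3, 0.4)
mu <- cbind(c(4, -4), c(3.5, 4), c(0, 0))
sigma <- array(0, c(2, 2, 3))
sigma[, , 1] <- cbind(c(1, -0.1), c(-0.1, 1))
for (h in 2:3) sigma[, , h] <- diag(2)

set.seed(111)
out <- sim_normal_mixture(1000, 2, 3, pro, mu, sigma)
dim(out$data)        # n by p matrix, as returned in the data component
table(out$cluster)   # number of observations assigned to each component
```

Here out$data plays the role of the matrix returned by rdemmix and rdemmix2, while the full list mirrors the structure described for rdemmix3.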
References
McLachlan G.J. and Krishnan T. (2008). The EM Algorithm and Extensions (2nd ed.). New Jersey: Wiley.
McLachlan G.J. and Peel D. (2000). Finite Mixture Models. New York: Wiley.
See Also
rdmvn, rdmvt, rdmsn, rdmst.
Examples
#load the package
library(EMMIXskew)
#specify the number of observations in each cluster
n1=300;n2=300;n3=400;
nn<-c(n1,n2,n3)
#specify the dimension of data, and number of clusters
p=2
g=3
#specify the distribution
distr <- "mvn"
#specify mean and covariance matrix for each component
sigma<-array(0,c(2,2,3))
for(h in 2:3) sigma[,,h]<-diag(2)
sigma[,,1]<-cbind( c(1,-0.1),c(-0.1,1))
mu <- cbind(c(4,-4),c(3.5,4),c( 0, 0))
#reset the random seed
set.seed(111)
#generate the dataset
dat <- rdemmix(nn,p,g,distr, mu,sigma)
# alternatively one can use
pro <- c(0.3,0.3,0.4)
n=1000
set.seed(111)
dat <- rdemmix2(n,p,g,distr,pro,mu,sigma)
plot(dat)
# or, to obtain the clustering together with the data
set.seed(111)
dobj <- rdemmix3(n,p,g,distr,pro,mu,sigma)
plot(dobj$data)
#other distributions are available: "mvt", "msn", and "mst"
#t-distributions
dof <- c(3,5,5)
dat <- rdemmix2(n,p,g,"mvt",pro,mu,sigma,dof)
plot(dat)
#Skew Normal distribution
delta <- cbind(c(3,3),c(1,5),c(-3,1))
dat <- rdemmix2(n,p,g,"msn",pro,mu,sigma,delta=delta)
plot(dat)
#Skew t-distribution
dat <- rdemmix2(n,p,g,"mst",pro,mu,sigma,dof,delta)
plot(dat)
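The role of delta in the skew normal example above can be sketched in base R. The code below uses one common stochastic representation of a skew normal vector, Y = mu + delta * |U0| + U1 with U0 ~ N(0,1) and U1 ~ N(0, Sigma). Whether the package uses exactly this construction is an assumption, and the function name sim_skew_normal is hypothetical.

```r
# Illustrative only: Y = mu + delta * |U0| + U1, with U0 ~ N(0, 1) and
# U1 ~ N(0, sigma). Assumption: this parameterisation matches the role of
# delta in the package; it is not taken from the package source.
sim_skew_normal <- function(n, mu, sigma, delta) {
  p <- length(mu)
  u0 <- abs(rnorm(n))                      # half-normal skewing variable
  z <- matrix(rnorm(n * p), ncol = p)
  u1 <- z %*% chol(matrix(sigma, p, p))    # rows distributed as N(0, sigma)
  # Row i is mu + delta * u0[i] + u1[i, ].
  sweep(u1 + outer(u0, delta), 2, mu, "+")
}

set.seed(111)
y <- sim_skew_normal(20000, mu = c(0, 0), sigma = diag(2), delta = c(3, 3))

# Under this representation the mean is mu + delta * sqrt(2 / pi),
# so a positive delta pulls the sample mean away from mu.
colMeans(y)
c(3, 3) * sqrt(2 / pi)
plot(y[1:1000, ])
```

Setting delta to a vector of zeros reduces this sketch to an ordinary multivariate normal sample, which is one way to see delta as a pure skewness parameter under the stated assumption.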
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
> library(EMMIXskew)
Loading required package: lattice
Loading required package: mvtnorm
Loading required package: KernSmooth
KernSmooth 2.23 loaded
Copyright M. P. Wand 1997-2009
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/EMMIXskew/rdemmix.Rd_%03d_medium.png", width=480, height=480)
> ### Name: rdemmix
> ### Title: Simulate Data Using Mixture Models
> ### Aliases: rdemmix rdemmix2 rdemmix3
> ### Keywords: cluster datasets
>
> ### ** Examples
>
> #specify the dimension of data, and number of clusters
> #the number of observations in each cluster
> n1=300;n2=300;n3=400;
> nn<-c(n1,n2,n3)
>
> p=2
> g=3
>
>
>
> #specify the distribution
> distr <- "mvn"
>
> #specify mean and covariance matrix for each component
>
> sigma<-array(0,c(2,2,3))
> for(h in 2:3) sigma[,,h]<-diag(2)
> sigma[,,1]<-cbind( c(1,-0.1),c(-0.1,1))
> mu <- cbind(c(4,-4),c(3.5,4),c( 0, 0))
>
> #reset the random seed
> set.seed(111)
> #generate the dataset
> dat <- rdemmix(nn,p,g,distr, mu,sigma)
>
>
>
> # alternatively one can use
> pro <- c(0.3,0.3,0.4)
> n=1000
> set.seed(111)
> dat <- rdemmix2(n,p,g,distr,pro,mu,sigma)
> plot(dat)
>
> # and
>
> set.seed(111)
> dobj <- rdemmix3(n,p,g,distr,pro,mu,sigma)
> plot(dobj$data)
>
>
> #other distributions such as "mvt","msn", and "mst".
>
> #t-distributions
>
> dof <- c(3,5,5)
> dat <- rdemmix2(n,p,g,"mvt",pro,mu,sigma,dof)
> plot(dat)
>
> #Skew Normal distribution
> delta <- cbind(c(3,3),c(1,5),c(-3,1))
> dat <- rdemmix2(n,p,g,"msn",pro,mu,sigma,delta=delta)
> plot(dat)
>
>
> #Skew t-distribution
> dat <- rdemmix2(n,p,g,"mst",pro,mu,sigma,dof,delta)
> plot(dat)
>
> dev.off()
null device
1
>