This function uses a k-means algorithm to heuristically select
suitable starting values for the general model.
Usage
choose.theta(u, m, ...)
Arguments
u
A matrix of (estimates of) realizations from the GMCM.
m
The number of components to be fitted.
...
Arguments passed to kmeans.
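As an illustration of the ... argument, the sketch below passes two standard kmeans arguments, nstart and iter.max, using base R only. The matrix u here is placeholder data, not GMCM output. The commented choose.theta call assumes the arguments are forwarded unchanged, as documented above.

```r
# Placeholder data: 1000 uniform points in 2 dimensions (an assumption
# for illustration, not realizations from a GMCM).
set.seed(2)
u <- matrix(runif(2000), ncol = 2)

# nstart and iter.max are regular arguments of stats::kmeans.
km <- stats::kmeans(u, centers = 3, nstart = 10, iter.max = 50)

# With GMCM loaded, the corresponding call would presumably be:
# theta <- choose.theta(u = u, m = 3, nstart = 10, iter.max = 50)
```

Several random starts (nstart > 1) typically make the k-means solution, and hence the starting values, less sensitive to the random initialization.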
Details
The function selects the centers from the k-means algorithm as an initial
estimate of the means. The proportional sizes of the clusters are selected
as the initial values of the mixture proportions. The within-cluster
standard deviations are used as the variance of each cluster. The
correlations between dimensions are taken to be zero.
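The steps described above can be sketched in base R as follows. This is a minimal illustration, not the GMCM source: the name sketch_theta is hypothetical, and the sketch follows the text literally by placing the within-cluster standard deviations on the diagonal of each covariance matrix. It does no rescaling or translation of the result, so its output will not necessarily match what choose.theta returns.

```r
# Hypothetical helper illustrating the heuristic; not part of GMCM.
sketch_theta <- function(u, m, ...) {
  stopifnot(is.matrix(u), m >= 1)
  km <- stats::kmeans(u, centers = m, ...)
  d <- ncol(u)
  comp <- paste0("comp", seq_len(m))

  # Mixture proportions: relative cluster sizes.
  pie <- stats::setNames(km$size / sum(km$size), paste0("pie", seq_len(m)))

  # Means: the k-means cluster centers.
  mu <- stats::setNames(
    lapply(seq_len(m), function(k) unname(km$centers[k, ])), comp)

  # Diagonal covariance matrices: per-dimension within-cluster standard
  # deviations on the diagonal, zero correlation between dimensions.
  # (A cluster with a single point would give NA here.)
  sigma <- stats::setNames(lapply(seq_len(m), function(k) {
    s <- apply(u[km$cluster == k, , drop = FALSE], 2, stats::sd)
    diag(s, nrow = d)
  }), comp)

  list(m = m, d = d, pie = pie, mu = mu, sigma = sigma)
}

# Usage on two well-separated synthetic clusters.
set.seed(1)
u <- rbind(matrix(rnorm(200, mean = 0, sd = 0.1), ncol = 2),
           matrix(rnorm(200, mean = 5, sd = 0.1), ncol = 2))
th <- sketch_theta(u, m = 2)
```

The returned list has the same element names (m, d, pie, mu, sigma) as the printed theta in the Results section.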
Value
A list of parameters for the GMCM model in the form described in
rtheta.
Note
The function uses the kmeans function from the
stats package.
Author(s)
Anders Ellern Bilgrau <anders.ellern.bilgrau@gmail.com>
Examples
set.seed(2)
# Simulating data
data1 <- SimulateGMCMData(n = 10000, m = 3, d = 2)
obs.data <- Uhat(data1$u) # The ranked observed data
# Using choose.theta to get starting estimates
theta <- choose.theta(u = obs.data, m = 3)
print(theta)
# To illustrate theta, we simulate from the model
data2 <- SimulateGMMData(n = 10000, theta = theta)
cols <- apply(get.prob(obs.data,theta),1,which.max)
# Plotting
par(mfrow = c(1,3))
plot(data1$z, main = "True latent GMM")
plot(Uhat(data1$u), col = cols,
main = "Observed GMCM\nColoured by k-means clustering")
plot(data2$z, main = "initial GMM")
Results
> library(GMCM)
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/GMCM/choose.theta.Rd_%03d_medium.png", width=480, height=480)
> ### Name: choose.theta
> ### Title: Heuristically chosen starting value of theta
> ### Aliases: choose.theta
>
> ### ** Examples
>
> set.seed(2)
>
> # Simulating data
> data1 <- SimulateGMCMData(n = 10000, m = 3, d = 2)
> obs.data <- Uhat(data1$u) # The ranked observed data
>
> # Using choose.theta to get starting estimates
> theta <- choose.theta(u = obs.data, m = 3)
> print(theta)
$m
[1] 3

$d
[1] 2

$pie
  pie1   pie2   pie3
0.4690 0.4002 0.1308

$mu
$mu$comp1
[1] 0 0

$mu$comp2
[1] -10.527002  -9.741925

$mu$comp3
[1] -13.372135   6.747667


$sigma
$sigma$comp1
     [,1]     [,2]
[1,]    1 0.000000
[2,]    0 1.004652

$sigma$comp2
         [,1]      [,2]
[1,] 1.165814 0.0000000
[2,] 0.000000 0.9295826

$sigma$comp3
         [,1]      [,2]
[1,] 1.892084 0.0000000
[2,] 0.000000 0.5284667

>
> # To illustrate theta, we simulate from the model
> data2 <- SimulateGMMData(n = 10000, theta = theta)
>
> cols <- apply(get.prob(obs.data,theta),1,which.max)
>
> # Plotting
> par(mfrow = c(1,3))
> plot(data1$z, main = "True latent GMM")
> plot(Uhat(data1$u), col = cols,
+ main = "Observed GMCM\nColoured by k-means clustering")
> plot(data2$z, main = "initial GMM")
>
> dev.off()
null device
1
>