blca.boot
R Documentation
Bayesian Latent Class Analysis via an EM Algorithm and Using Empirical Bootstrapping
Description
Latent class analysis (LCA) attempts to find G hidden classes in binary data X. blca.boot repeatedly samples from X with replacement, then uses an EM algorithm to find maximum a posteriori (MAP) and standard error estimates of the parameters.
Usage
blca.boot(X, G, alpha = 1, beta = 1, delta = rep(1, G),
start.vals = c("single", "across"), counts.n = NULL,
fit = NULL, iter = 50, B = 100, relabel = FALSE,
verbose = TRUE, verbose.update = 10, small = 1e-100)
Arguments
X
The data matrix. This may take one of several forms, see data.blca.
G
The number of latent classes to fit.
alpha, beta
The prior values for the data conditional on group membership. These may take several forms: a single value, recycled across all groups and columns; a vector of length G or M (the number of columns in the data); or a G × M matrix specifying each prior value separately. Defaults to 1, i.e., a uniform prior, for each value.
delta
Prior values for the mixture components in the model. Defaults to 1, i.e., a uniform prior. May be single or vector valued (of length G).
start.vals
Denotes how class membership is to be assigned during the initial step of the algorithm. Two character values may be chosen, "single", which randomly assigns data points exclusively to one class, or "across", which assigns class membership via runif. Alternatively, class membership may be pre-specified, either as a vector of class membership, or as a matrix of probabilities. Defaults to "single".
counts.n
If data patterns have already been counted, a data matrix consisting of each unique data pattern can be supplied to the function, in addition to a vector counts.n, which supplies the corresponding number of times each pattern occurs in the data.
fit
Previously fitted models may be supplied in order to approximate standard error and unbiased point estimates. fit should be an object of class "blca.em". Defaults to NULL if no object is supplied.
iter
The maximum number of iterations that the algorithm runs over, for each bootstrapped sample. Will stop earlier if the algorithm converges.
B
The number of bootstrap samples to run. Defaults to 100.
relabel
Logical valued. As the data are repeatedly resampled, label-switching may occur in the parameter estimates. If TRUE, parameter estimates are checked at each iteration, and relabeled if necessary. Defaults to FALSE.
verbose
Logical valued. If TRUE, the current number of completed bootstrap samples is printed at regular intervals.
verbose.update
If verbose = TRUE, verbose.update sets how many bootstrap samples are completed between printed updates.
small
To ensure numerical stability a small constant is added to certain parameter estimates. Defaults to 1e-100.
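As an illustration of the counts.n argument described above, the following base R sketch collapses a binary matrix into its unique patterns and their frequencies. The simulated matrix and the names keys, patterns and counts are hypothetical, and the final blca.boot call is an assumed usage inferred from the argument description rather than a tested recipe.

```r
# Simulate a small binary data set (hypothetical; any 0/1 matrix would do).
set.seed(1)
X <- matrix(rbinom(200 * 4, size = 1, prob = 0.5), ncol = 4)

# Collapse each row to a string key and tabulate the keys.
keys <- apply(X, 1, paste, collapse = "")
tab <- table(keys)

# Keep one row per unique pattern, in the same order as the counts.
patterns <- X[match(names(tab), keys), , drop = FALSE]
counts <- as.vector(tab)

# The counts must add back up to the number of observations.
stopifnot(sum(counts) == nrow(X), nrow(patterns) == length(counts))

# Assumed usage, following the argument description (not run here):
# fit.boot <- blca.boot(patterns, 2, counts.n = counts)
```

Supplying pre-counted patterns in this way avoids repeating the counting step when the same data are fitted several times.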
Details
Bootstrapping methods can be used to estimate properties of a distribution's parameters, such as the standard error estimates, by constructing multiple resamples of an observed dataset, obtained by sampling with replacement from said dataset. The multiple parameter estimates obtained from these resamples may then be analysed. This method is implemented in blca.boot by first running blca.em over the full data set and then using the returned values of the item and class probabilities as the initial values when running the algorithm for each bootstrapped sample. Alternatively, initial parameter estimates may be specified using the fit argument.
Note that if a previously fitted model is supplied, then the prior values with which the model was fitted will be used for the sampling run, regardless of the values supplied to the prior arguments.
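The resampling scheme described above can be sketched in base R. This is a minimal illustration only: the estimate function below is a hypothetical stand-in that returns column proportions, not the EM fit that blca.boot performs, and all object names are assumptions made for the example.

```r
set.seed(42)

# Simulated binary data: column j is Bernoulli with probability prob[j].
prob <- c(0.8, 0.8, 0.2, 0.2)
X <- matrix(rbinom(500 * 4, size = 1, prob = prob), ncol = 4, byrow = TRUE)

# Stand-in estimator (an assumption): column proportions instead of an EM fit.
estimate <- function(data) colMeans(data)

# Step 1: estimate on the full data set; blca.boot uses such values as
# starting points for each resample.
initial <- estimate(X)

# Step 2: draw B resamples of the rows with replacement and re-estimate.
B <- 100
boot <- t(replicate(B, {
  idx <- sample(nrow(X), replace = TRUE)
  estimate(X[idx, , drop = FALSE])
}))

# Step 3: summarise the B estimates; their spread approximates the
# standard error of each parameter.
boot.mean <- colMeans(boot)
boot.sd <- apply(boot, 2, sd)
```

Here boot is a B by 4 matrix with one row of estimates per resample, loosely analogous to the per-sample estimates that blca.boot returns in samples.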
Value
A list of class "blca.boot" is returned, containing:
call
The initial call passed to the function.
itemprob
The item probabilities, conditional on class membership.
classprob
The class probabilities.
Z
Estimate of class membership for each unique datapoint.
itemprob.sd
Posterior standard deviation estimates of the item probabilities.
classprob.sd
Posterior standard deviation estimates of the class probabilities.
classprob.initial, itemprob.initial
Initial parameter values for classprob and itemprob, used to run over each bootstrapped sample.
samples
A list containing the parameter estimates for each bootstrapped sample.
logpost
The log-posterior of the estimated model.
BIC
The Bayesian Information Criterion for the estimated model.
AIC
Akaike's Information Criterion for the estimated model.
label
Logical value, indicating whether label switching has been checked for.
counts
The number of times each unique datapoint occurred.
prior
A list containing the prior values specified for the model.
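The label entry above records whether label switching was checked for. To show what such a check can involve, the sketch below reorders the classes of one estimate so that they best match a reference estimate. It is a hypothetical illustration in base R; the helper functions perms and relabel are invented for this example and are not claimed to reproduce the method blca.boot uses internally.

```r
# All permutations of 1:n, one per row (only practical for small G).
perms <- function(n) {
  if (n == 1) return(matrix(1, 1, 1))
  p <- perms(n - 1)
  unname(do.call(rbind, lapply(1:n, function(i) cbind(i, p + (p >= i)))))
}

# Reorder the rows (classes) of est to minimise squared distance to ref.
relabel <- function(est, ref) {
  P <- perms(nrow(ref))
  cost <- apply(P, 1, function(p) sum((est[p, , drop = FALSE] - ref)^2))
  est[P[which.min(cost), ], , drop = FALSE]
}

# Reference item probabilities (G = 2 classes, M = 4 items).
ref <- rbind(c(0.8, 0.8, 0.2, 0.2),
             c(0.2, 0.2, 0.8, 0.8))

# An estimate from a resample in which the two class labels have swapped.
est <- rbind(c(0.25, 0.20, 0.78, 0.80),
             c(0.81, 0.84, 0.19, 0.17))

fixed <- relabel(est, ref)
```

After relabelling, the first row of fixed is the class with high probabilities on the first two items, matching the ordering in ref.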
Note
Earlier versions of this function erroneously referred to posterior standard deviations as standard errors. This also extended to arguments supplied to and values returned by the function, some of which are now returned with the corrected suffix .sd (for standard deviation). For backwards compatibility, the earlier suffix .se has been retained in the returned values.
Author(s)
Arthur White
References
Wasserman, L, 22nd May 2007, All of Nonparametric Statistics, Springer-Verlag.
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
> library(BayesLCA)
Loading required package: e1071
Loading required package: coda
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/BayesLCA/blca.boot.Rd_%03d_medium.png", width=480, height=480)
> ### Name: blca.boot
> ### Title: Bayesian Latent Class Analysis via an EM Algorithm and Using
> ### Empirical Bootstrapping
> ### Aliases: blca.boot
> ### Keywords: bootstrap blca
>
> ### ** Examples
>
> type1 <- c(0.8, 0.8, 0.2, 0.2)
> type2 <- c(0.2, 0.2, 0.8, 0.8)
> x <- rlca(1000, rbind(type1,type2), c(0.6,0.4))
> fit.boot <- blca.boot(x, 2)
Object 'fit' not supplied. Obtaining starting values via blca.em...
Restart number 1, logpost = -2453.41...
New maximum found... Restart number 2, logpost = -2453.41...
New maximum found... Restart number 3, logpost = -2453.41...
Restart number 4, logpost = -2453.41...
Restart number 5, logpost = -2453.41...
Starting values obtained...
Beginning bootstrapping run...
10 of 100 samples completed...
20 of 100 samples completed...
30 of 100 samples completed...
40 of 100 samples completed...
50 of 100 samples completed...
60 of 100 samples completed...
70 of 100 samples completed...
80 of 100 samples completed...
90 of 100 samples completed...
100 of 100 samples completed...
Bootstrap sampling run completed.
> summary(fit.boot)
__________________
Bayes-LCA
Diagnostic Summary
__________________
Hyper-Parameters:
Item Probabilities:
alpha:
        Col 1 Col 2 Col 3 Col 4
Group 1     1     1     1     1
Group 2     1     1     1     1
beta:
        Col 1 Col 2 Col 3 Col 4
Group 1     1     1     1     1
Group 2     1     1     1     1
Class Probabilities:
delta:
Group 1 Group 2
      1       1
__________________
Method: Bootstrap
Number of Samples: 100
Log-Posterior: -2453.509
AIC: -4925.018
BIC: -4969.188
>
> fit <- blca.em(x, 2, se=FALSE)
Restart number 1, logpost = -2453.41...
Restart number 2, logpost = -2453.41...
Restart number 3, logpost = -2453.41...
Restart number 4, logpost = -2453.41...
Restart number 5, logpost = -2453.41...
> fit.boot <- blca.boot(x, 2, fit=fit)
Beginning bootstrapping run...
10 of 100 samples completed...
20 of 100 samples completed...
30 of 100 samples completed...
40 of 100 samples completed...
50 of 100 samples completed...
60 of 100 samples completed...
70 of 100 samples completed...
80 of 100 samples completed...
90 of 100 samples completed...
100 of 100 samples completed...
Bootstrap sampling run completed.
> fit.boot
MAP Estimates:
Item Probabilities:
        Col 1 Col 2 Col 3 Col 4
Group 1 0.809 0.837 0.186 0.168
Group 2 0.245 0.204 0.779 0.795
Membership Probabilities:
Group 1 Group 2
  0.551   0.449
Posterior Standard Deviation Estimates:
Item Probabilities:
        Col 1 Col 2 Col 3 Col 4
Group 1 0.022 0.020 0.019 0.021
Group 2 0.024 0.024 0.026 0.024
Membership Probabilities:
Group 1 Group 2
  0.022   0.022
> plot(fit.boot, which=1:4)
> dev.off()
null device
1
>