A data frame that consists of three components: the variables
object, attribute and rating. Each row of the data frame describes the outcome of a binary rater judgement
about the association between a certain object and a certain attribute.
object
The name of the object component in the data frame data. The values of the vector data$object should be (non-missing) numeric or character values.
attribute
The name of the attribute component in the data frame data. The values of the vector data$attribute should be (non-missing) numeric or character values.
rating
The name of the rating component in the data frame data. The elements of the vector data$rating should be the numeric values 0 (no association) or 1 (association),
or should be specified as missing (NA).
freq1
A J X K matrix of observed association frequencies.
freqtot
A J X K matrix with the total number of binary ratings in each cell (j,k). If the total number of ratings is the same for all cells of the matrix,
it is sufficient to enter a single numeric value rather than a matrix. For instance, if N raters have judged J X K associations, one may specify freqtot=N.
F
The number of latent features included in the model.
Nchains
The number of Markov chains that are simulated using a data-augmented Gibbs sampling algorithm.
Nburnin
The number of burn-in iterations.
maxNiter
The maximum number of iterations that will be computed for each chain.
Nstep
The convergence of the chains to the true posterior will be checked for each parameter after c*Nstep iterations with c=1,2,...
The convergence will only be checked when Nchains>1.
Rhatcrit
The estimation procedure will be stopped if the Rhat convergence diagnostic is smaller than Rhatcrit
for each object and attribute parameter. By default Rhatcrit=1.2.
maprule
Disjunctive (maprule="disj") or conjunctive (maprule="conj") mapping rule of the probabilistic latent feature model.
datatype
The type of data used as input. When datatype="freq" one should specify frequency data freq1 and freqtot, and when datatype="dataframe" one should
specify the name of the data frame data, and its components, object, attribute and rating.
start.bayes
This argument can be used to define the type of starting point for the Bayesian analysis. If start.bayes="best" the best solution of a plfm analysis
is used as the starting point for the Bayesian analysis, and if start.bayes = "fitted.plfm", the
starting point is read from the (plfm) object assigned to the argument
fitted.plfm. If start.bayes="random", a random starting point is used for the Bayesian analysis.
fitted.plfm
The name of the plfm object that contains posterior mode estimates for the specified model.
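The two input formats described above carry the same information. As a rough sketch in base R, the following shows how long-format ratings could be tabulated into freq1 and freqtot. The data frame "ratings" and its values are hypothetical, and the sketch assumes that missing ratings are left out of both counts, which this page does not state explicitly.

```r
## Hypothetical long-format ratings: 3 raters x 2 objects x 2 attributes
ratings <- data.frame(
  object    = rep(c("o1", "o2"), each = 2, times = 3),
  attribute = rep(c("a1", "a2"), times = 6),
  rating    = c(1, 0, 1, 1,
                1, NA, 0, 1,
                0, 0, 1, 1)
)

## Cross-classify the ratings by object (rows) and attribute (columns)
cells <- list(ratings$object, ratings$attribute)

## freq1: number of ratings equal to 1 in each cell (NA ignored)
freq1 <- tapply(ratings$rating, cells, function(x) sum(x == 1, na.rm = TRUE))

## freqtot: number of non-missing ratings in each cell
freqtot <- tapply(ratings$rating, cells, function(x) sum(!is.na(x)))

freq1
freqtot
```

Note that tapply returns NA for object-attribute combinations that never occur in the data, so such cells would need separate handling.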
Details
The function bayesplfm can be used to compute a sample of the posterior
distribution of disjunctive or conjunctive probabilistic latent feature models with a particular number of features
using a data-augmented Gibbs sampling algorithm
(Meulders, De Boeck, Van Mechelen, Gelman, and Maris, 2001; Meulders, De Boeck, Van Mechelen, and Gelman, 2005; Meulders, 2013).
Depending on the value of the parameter Nchains, the function computes a single chain or multiple chains.
When only one chain is computed, no convergence measure is reported. When more than one chain is computed, for each parameter,
convergence to the true posterior distribution is assessed using the Rhat convergence diagnostic proposed by Gelman and Rubin (1992).
When using bayesplfm for Bayesian analysis, the same starting point is used for each simulated chain. The reason is that
the posterior distribution of probabilistic feature models with F>2 is always multimodal
(local maxima may exist, and feature labels may be switched), so the aim of the Bayesian analysis is to compute a sample in the neighbourhood
of one specific posterior mode. It is recommended to use the best posterior mode obtained
with the plfm function as the starting point for the Bayesian analysis: either use start.bayes="best", or specify start.bayes="fitted.plfm" together with
fitted.plfm=object, where "object" is a plfm object that contains posterior mode estimates for the specified model. As an alternative to using the plfm()
function, one may use a random starting point for the Bayesian analysis (start.bayes="random") to explore the posterior distribution.
The function bayesplfm() will converge well if the distinct posterior modes are well separated and all chains visit only the same mode during the estimation process.
However, because the posterior distribution is multimodal, the sampler may fail to converge if it starts visiting different posterior modes within
one chain, or if different chains sample from distinct posterior modes.
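To illustrate why chains that settle in different modes fail the convergence check, here is a minimal sketch of the basic Gelman and Rubin (1992) potential scale reduction factor for a single parameter. The helper rhat_sketch and the simulated draws are hypothetical, and the exact variant of Rhat computed inside bayesplfm may differ from this textbook form.

```r
## draws: Niter x Nchains matrix of draws for ONE parameter
## (hypothetical helper; basic Gelman-Rubin form, not taken from plfm)
rhat_sketch <- function(draws) {
  n <- nrow(draws)                      # iterations per chain
  chain_means <- colMeans(draws)
  B <- n * var(chain_means)             # between-chain variance
  W <- mean(apply(draws, 2, var))       # average within-chain variance
  var_plus <- (n - 1) / n * W + B / n   # pooled variance estimate
  sqrt(var_plus / W)
}

set.seed(1)
## Two chains sampling around the same mode
same_mode <- cbind(rnorm(1000, 0.3, 0.05), rnorm(1000, 0.3, 0.05))
## Two chains sampling around distinct modes
diff_modes <- cbind(rnorm(1000, 0.3, 0.05), rnorm(1000, 0.7, 0.05))

rhat_same <- rhat_sketch(same_mode)
rhat_diff <- rhat_sketch(diff_modes)

## Compare against the default criterion Rhatcrit = 1.2
c(same = rhat_same < 1.2, different = rhat_diff < 1.2)
```

When the chains sit in different modes, the between-chain variance dominates the within-chain variance, so the ratio stays far above Rhatcrit no matter how long the chains run.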
Value
call
Parameters used to call the function.
sample.objpar
A J X F X Niter X Nchains array with parameter values for the object parameters.
The matrix sample.objpar[,,i,c] contains the draw of the object parameters
in iteration i of chain c. Note: when Nchains=1 the chain length Niter equals maxNiter,
and when Nchains>1 the chain length Niter equals the number of iterations required to obtain convergence.
sample.attpar
A K X F X Niter X Nchains array with parameter values for the attribute parameters.
The matrix sample.attpar[,,i,c] contains the draw of the attribute parameters
in iteration i of chain c. Note: when Nchains=1 the chain length Niter equals maxNiter,
and when Nchains>1 the chain length Niter equals the number of iterations required to obtain convergence.
pmean.objpar
A J X F matrix with the posterior mean of the object parameters computed on all iterations and chains in the sample.
pmean.attpar
A K X F matrix with the posterior mean of the attribute parameters computed on all iterations and chains in the sample.
p95.objpar
A 3 X J X F array which contains for each object parameter the percentiles 2.5, 50 and 97.5.
p95.attpar
A 3 X K X F array which contains for each attribute parameter the percentiles 2.5, 50 and 97.5.
Rhat.objpar
A J X F matrix with Rhat convergence values for the object parameters.
Rhat.attpar
A K X F matrix with Rhat convergence values for the attribute parameters.
fitmeasures
A list with two measures of descriptive fit on the J X K table: (1) the correlation between observed and expected frequencies,
and (2) the proportion of the variance in the observed frequencies accounted for by the model.
The association probabilities and corresponding expected frequencies are computed using the posterior mean of the parameters.
convstat
The number of object and attribute parameters that do not meet the convergence criterion.
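The posterior summaries listed above can be related to the sample arrays with base R. The sketch below uses a simulated array in place of real sampler output (all object names and dimensions here are hypothetical) and shows how summaries with the same shapes as pmean.objpar and p95.objpar could be computed. Whether bayesplfm computes them in exactly this way, for example with respect to burn-in draws, is an assumption.

```r
set.seed(2)
J <- 4; nF <- 2; Niter <- 200; Nchains <- 2

## Simulated stand-in for a J X F X Niter X Nchains array such as sample.objpar
sample_objpar <- array(runif(J * nF * Niter * Nchains),
                       dim = c(J, nF, Niter, Nchains))

## Posterior mean per parameter, pooling all iterations and chains
pmean <- apply(sample_objpar, c(1, 2), mean)

## Percentiles 2.5, 50 and 97.5 per parameter; result is a 3 X J X F array
p95 <- apply(sample_objpar, c(1, 2), quantile,
             probs = c(0.025, 0.5, 0.975))

dim(pmean)
dim(p95)

## Draw of all object parameters in iteration 10 of chain 2
sample_objpar[, , 10, 2]
```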
Author(s)
Michel Meulders
References
Gelman, A., and Rubin, D. B. (1992). Inference from iterative simulation using multiple
sequences. Statistical Science, 7, 457-472.
Meulders, M., De Boeck, P., Van Mechelen, I., Gelman, A., and Maris, E. (2001). Bayesian inference with probability matrix decomposition models.
Journal of Educational and Behavioral Statistics, 26, 153-179.
Meulders, M., De Boeck, P., Van Mechelen, I., and Gelman, A. (2005). Probabilistic feature analysis of facial perception of emotions.
Applied Statistics, 54, 781-793.
Meulders, M. (2013). An R Package for Probabilistic Latent Feature Analysis of Two-Way Two-Mode Frequencies. Journal of Statistical Software, 54(14), 1-29.
URL http://www.jstatsoft.org/v54/i14/.
See Also
plfm, summary.bayesplfm, print.summary.bayesplfm
Examples
## Not run:
## example 1: Bayesian analysis using data generated under the model
## define number of objects
J<-10
## define number of attributes
K<-10
## define number of features
F<-2
## generate true parameters
set.seed(43565)
objectparameters<-matrix(runif(J*F),nrow=J)
attributeparameters<-matrix(runif(K*F),nrow=K)
## generate data for conjunctive model using N=100 replications
gdat<-gendat(maprule="conj",N=100,
objpar=objectparameters,attpar=attributeparameters)
## Use stepplfm to compute posterior mode(s) for 1 up to 3 features
conj.lst<-stepplfm(minF=1,maxF=3,maprule="conj",freq1=gdat$freq1,freqtot=100,M=5)
## Compute a sample of the posterior distribution
## for the conjunctive model with two features
## use the posterior mode obtained with stepplfm as starting point
conjbayes2<-bayesplfm(maprule="conj",freq1=gdat$freq1,freqtot=100,F=2,
maxNiter=3000,Nburnin=0,Nstep=1000,Nchains=2,
start.bayes="fitted.plfm",fitted.plfm=conj.lst[[2]])
## End(Not run)
## Not run:
## example 2: Bayesian analysis of situational determinants of anger-related behavior
## load data
data(anger)
## Compute one chain of 500 iterations (including 250 burn-in iterations)
## for the disjunctive model with two features
## use a random starting point
bayesangerdisj2a<-bayesplfm(maprule="disj",freq1=anger$freq1,freqtot=anger$freqtot,F=2,
maxNiter=500,Nstep=500,Nburnin=250,Nchains=1,start.bayes="random")
## print a summary of the output
summary(bayesangerdisj2a)
## Compute a sample of the posterior distribution
## for the disjunctive model with two features
## compute starting points with plfm
## run 2 chains with a maximum length of 10000 iterations
## compute convergence after each 1000 iterations
bayesangerdisj2b<-bayesplfm(maprule="disj",freq1=anger$freq1,freqtot=anger$freqtot,F=2,
maxNiter=10000,Nburnin=0,Nstep=1000,Nchains=2,start.bayes="best")
## print the output of the disjunctive 2-feature model for the anger data
print(bayesangerdisj2b)
## print a summary of the output of the disjunctive 2-feature model
## for the anger data
summary(bayesangerdisj2b)
## End(Not run)