Last data update: 2014.03.03

R: lz and lzstar person-fit statistics
lz, lzstar    R Documentation

lz and lzstar person-fit statistics

Description

Compute the lz (Drasgow, Levine, and Williams, 1985) and the lzstar (Snijders, 2001) person-fit statistics.

Usage

lz(matrix,
   NA.method = "Pairwise", Save.MatImp = FALSE,
   IP = NULL, IRT.PModel = "2PL", Ability = NULL, Ability.PModel = "ML",
   mu = 0, sigma = 1)

lzstar(matrix,
       NA.method = "Pairwise", Save.MatImp = FALSE,
       IP = NULL, IRT.PModel = "2PL", Ability = NULL, Ability.PModel = "ML",
       mu = 0, sigma = 1)

Arguments

matrix

Data matrix of dichotomous item scores: Persons as rows, items as columns, item scores are either 0 or 1, missing values allowed.

NA.method

Method to deal with missing values. The default is pairwise elimination ("Pairwise"). Alternatively, single imputation methods are available: "Hotdeck", "NPModel", and "PModel".

Save.MatImp

Logical. Save (imputed) data matrix to file? Default is FALSE.

IP

Matrix with previously estimated item parameters: One row per item, and three columns ([,1] item discrimination; [,2] item difficulty; [,3] lower-asymptote, also referred to as pseudo-guessing parameter).

In case no item parameters are available then IP=NULL.

IRT.PModel

Specify the IRT model to use in order to estimate the item parameters (only if IP=NULL). The options available are "1PL", "2PL" (default), and "3PL".

Ability

Vector with previously estimated latent ability parameters, one per respondent, following the order of the row index of matrix.

In case no ability parameters are available then Ability=NULL.

Ability.PModel

Specify the method to use in order to estimate the latent ability parameters (only if Ability=NULL). The options available are "ML" (default), "BM", and "WL".

mu

Mean of the a priori distribution. Only used when Ability.PModel="BM". Default is 0.

sigma

Standard deviation of the a priori distribution. Only used when Ability.PModel="BM". Default is 1.
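
The shapes expected by the IP and Ability arguments described above can be illustrated with a small sketch. All numeric values below are invented for illustration, and the object names X, ip, and theta are hypothetical.

```r
# Hypothetical 4-item, 3-respondent example; every value is made up.
X <- matrix(c(1, 1, 0, 0,
              1, 0, 1, 0,
              1, 1, 1, 0), nrow = 3, byrow = TRUE)

# One row per item: [,1] discrimination, [,2] difficulty, [,3] lower asymptote.
ip <- cbind(a = c(1.2, 0.8, 1.0, 1.5),
            b = c(-1.0, -0.3, 0.4, 1.1),
            c = rep(0, 4))  # zero lower asymptote (assumed here for simplicity)

# One ability value per respondent, in the row order of X.
theta <- c(-0.2, 0.1, 0.9)

# Basic shape checks before passing the objects on.
stopifnot(nrow(ip) == ncol(X), ncol(ip) == 3, length(theta) == nrow(X))
```

Objects shaped like these could then be supplied through the arguments shown under Usage, for example lz(X, IP = ip, Ability = theta).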

Details

Drasgow et al. (1985) introduced one of the most used person-fit statistics, lz. This statistic is the standardized log-likelihood of the respondent's response vector. lz is (supposed to be) asymptotically standard normally distributed.
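
The standardized log-likelihood described above can be sketched for a single respondent as follows. This is an illustration under the 2PL model with known parameters, not PerFit's internal code; the function name lz_by_hand and all parameter values are hypothetical.

```r
# Hypothetical helper: lz for one response vector x under the 2PL model,
# given ability theta, discriminations a, and difficulties b.
lz_by_hand <- function(x, theta, a, b) {
  P <- plogis(a * (theta - b))                    # probability of a 1 on each item
  Q <- 1 - P
  l0 <- sum(x * log(P) + (1 - x) * log(Q))        # observed log-likelihood
  E  <- sum(P * log(P) + Q * log(Q))              # its expectation under the model
  V  <- sum(P * Q * log(P / Q)^2)                 # its variance under the model
  (l0 - E) / sqrt(V)                              # standardized log-likelihood
}

# Invented item parameters: equal discriminations, difficulties from easy to hard.
a <- rep(1, 5)
b <- c(-2, -1, 0, 1, 2)

# A pattern consistent with the item ordering versus its reversal.
lz_consistent <- lz_by_hand(c(1, 1, 1, 0, 0), theta = 0, a = a, b = b)
lz_aberrant   <- lz_by_hand(c(0, 0, 1, 1, 1), theta = 0, a = a, b = b)
```

With these made-up values the consistent pattern yields a positive score and the reversed pattern a strongly negative one.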

The computation of lz requires that both item and ability parameters are available. Function lz allows the user to supply his or her own item and ability parameter estimates (arguments IP and Ability, respectively). Alternatively, lz relies on functions available through the irtoys package for estimating the parameters. Specifically, the user can choose one of three IRT models to fit the data: IRT.PModel="1PL", IRT.PModel="2PL", or IRT.PModel="3PL". For estimating the ability parameters there are three possible methods: Ability.PModel="ML" (maximum likelihood), Ability.PModel="BM" (Bayes modal), or Ability.PModel="WL" (weighted likelihood).

It was later observed by several researchers (e.g., Molenaar and Hoijtink, 1990) that the asymptotic approximation only holds when true ability values are used. This limitation was overcome by Snijders (2001), who further developed lz into the lzstar statistic. An accessible paper that thoroughly explains the basic principles behind lzstar is Magis, Raiche, and Beland (2012). It is important to realize that not all item and/or ability estimation procedures can be used when computing lzstar. In particular, the estimation of the ability parameters is constrained (see Snijders, 2001, Equation 5). The lzstar algorithm internally estimates the ability parameters accordingly for one of three possible methods: Ability.PModel="ML" (maximum likelihood), Ability.PModel="BM" (Bayes modal), or Ability.PModel="WL" (weighted likelihood), see Magis et al. (2012). The user may provide his or her own ability estimates in case they are available by means of other software. In this case it is necessary to specify the method that was used for the estimation (ML, BM, or WL) using the argument Ability.PModel.
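
The correction can be sketched for the simplest case: the 2PL model with a maximum likelihood ability estimate. The sketch follows the weight adjustment described by Snijders (2001) as summarized in Magis et al. (2012) and should be checked against those papers; it is not PerFit's internal code, and the function name lzstar_ml_2pl and all parameter values are hypothetical. Under ML estimation the numerator of lz is kept unchanged, and only the variance is recomputed with adjusted weights.

```r
# Hypothetical helper: lz and lzstar for one non-perfect response vector x
# under the 2PL model, using the ML ability estimate.
lzstar_ml_2pl <- function(x, a, b) {
  P_fun <- function(theta) plogis(a * (theta - b))
  # ML estimate: root of the likelihood equation sum(a * (x - P)) = 0.
  score <- function(theta) sum(a * (x - P_fun(theta)))
  theta <- uniroot(score, interval = c(-6, 6))$root

  P  <- P_fun(theta)
  Q  <- 1 - P
  dP <- a * P * Q                       # first derivative of P with respect to theta
  w  <- log(P / Q)                      # lz weights
  r  <- dP / (P * Q)                    # estimating-equation weights for ML
  cn <- sum(dP * w) / sum(dP * r)
  w_tilde <- w - cn * r                 # adjusted weights

  Wn <- sum((x - P) * w)                # centred log-likelihood (numerator of lz)
  c(theta  = theta,
    lz     = Wn / sqrt(sum(P * Q * w^2)),
    lzstar = Wn / sqrt(sum(P * Q * w_tilde^2)))
}

# Invented item parameters and response pattern.
res <- lzstar_ml_2pl(x = c(1, 1, 0, 1, 0),
                     a = c(1.2, 0.8, 1.0, 1.5, 0.9),
                     b = c(-1.5, -0.5, 0.0, 0.5, 1.5))
```

Because the adjusted weights never yield a larger variance than the original ones, lzstar in this setting is at least as large in absolute value as lz computed at the same estimate.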

Aberrant response behavior is (potentially) indicated by small values of lz/lzstar (i.e., in the left tail of the sampling distribution).
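
A left-tail decision rule can be sketched as follows. The scores are invented, and the cutoff assumes the asymptotic standard normal distribution holds at a nominal 5% level, which is an assumption of this sketch rather than a recommendation from the sources cited here.

```r
# Hypothetical person-fit scores for five respondents (made-up values).
scores <- c(0.8, -0.3, -2.4, 1.1, -1.7)

# One-sided cutoff in the left tail of the standard normal distribution.
cutoff_value <- qnorm(0.05)

# Respondents whose score falls below the cutoff are flagged as potentially aberrant.
flagged <- which(scores < cutoff_value)
```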

Missing values in matrix are dealt with by means of pairwise elimination by default. Alternatively, single imputation is also available. Three single imputation methods exist: Hotdeck imputation (NA.method = "Hotdeck"), nonparametric model imputation (NA.method = "NPModel"), and parametric model imputation (NA.method = "PModel"); see Zhang and Walker (2008).

  • Hotdeck imputation replaces missing responses of an examinee ('recipient') by item scores from the examinee who is closest to the recipient ('donor'), based on the recipient's nonmissing item scores. The similarity between nonmissing item scores of recipients and donors is based on the sum of absolute differences between the corresponding item scores. The donor's response pattern is deemed to be the most similar to the recipient's response pattern in the group, so item scores of the former are used to replace the corresponding missing values of the latter. When multiple donors are equidistant to a recipient, one donor is randomly drawn from the set of all such donors.

  • The nonparametric model imputation method is similar to the hotdeck imputation, but item scores are generated from Bernoulli distributions with probabilities defined by donors whose total score is similar to the recipient's (based on all items except the NAs).

  • The parametric model imputation method is similar to the hotdeck imputation, but item scores are generated from Bernoulli distributions with probabilities estimated by means of parametric IRT models (IRT.PModel = "1PL", "2PL", or "3PL"). Item parameters (IP) and ability parameters (Ability) may be provided for this purpose (otherwise the algorithm finds estimates for these parameters).
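
The hotdeck donor search described in the first bullet can be sketched as follows. This is an illustration, not PerFit's internal code: the function name hotdeck_impute_row is hypothetical, and restricting donors to rows without any missing values is a simplifying assumption of the sketch.

```r
# Hypothetical helper: impute the missing scores in row r of X from the closest donor.
hotdeck_impute_row <- function(X, r) {
  miss <- is.na(X[r, ])
  if (!any(miss)) return(X[r, ])
  # Candidate donors: all other rows with complete data (assumption of this sketch).
  donors <- setdiff(which(stats::complete.cases(X)), r)
  stopifnot(length(donors) > 0)
  # Distance: sum of absolute differences on the recipient's nonmissing items.
  d <- vapply(donors,
              function(i) sum(abs(X[i, !miss] - X[r, !miss])),
              numeric(1))
  closest <- donors[d == min(d)]
  # Ties are broken by drawing one of the equidistant donors at random.
  donor <- if (length(closest) > 1) sample(closest, 1) else closest
  out <- X[r, ]
  out[miss] <- X[donor, miss]
  out
}

# Made-up data: row 2 is missing its score on item 2.
X <- rbind(c(1, 1,  0, 0),
           c(1, NA, 0, 0),
           c(0, 0,  1, 1),
           c(1, 0,  1, 0))
imputed <- hotdeck_impute_row(X, 2)
```

In this example row 1 matches the recipient on all three observed items, so it is the unique donor and its score on item 2 is copied over.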

Value

An object of class "PerFit", which is a list with 12 elements:

PFscores

A list of length N (number of respondents) with the values of the person-fit statistic.

PFstatistic

The person-fit statistic used.

PerfVects

A message indicating whether perfect response vectors (all-0s or all-1s) were removed from the analysis.

ID.all0s

Row indices of all-0s response vectors removed from the analysis (if applicable).

ID.all1s

Row indices of all-1s response vectors removed from the analysis (if applicable).

matrix

The data matrix after imputation of missing values was performed (if applicable).

Ncat

The number of response categories (2 in this case).

IRT.PModel

The parametric IRT model used.

IP

The I x 3 matrix of estimated item parameters (I = number of items).

Ability.PModel

The method used to estimate abilities.

Ability

The vector of N estimated ability parameters.

NAs.method

The imputation method used (if applicable).

Author(s)

Jorge N. Tendeiro j.n.tendeiro@rug.nl

References

Drasgow, F., Levine, M. V., and Williams, E. A. (1985) Appropriateness measurement with polychotomous item response models and standardized indices. British Journal of Mathematical and Statistical Psychology, 38(1), 67–86.

Karabatsos, G. (2003) Comparing the aberrant response detection performance of thirty-six person-fit statistics. Applied Measurement in Education, 16(4), 277–298.

Magis, D., Raiche, G., and Beland, S. (2012) A didactic presentation of Snijders's lz* index of person fit with emphasis on response model selection and ability estimation. Journal of Educational and Behavioral Statistics, 37(1), 57–81.

Meijer, R. R., and Sijtsma, K. (2001) Methodology review: Evaluating person fit. Applied Psychological Measurement, 25(2), 107–135.

Molenaar, I. W., and Hoijtink, H. (1990) The many null distributions of person fit indices. Psychometrika, 55(1), 75–106.

Snijders, T. A. B. (2001) Asymptotic null distribution of person fit statistics with estimated person parameter. Psychometrika, 66(3), 331–342.

Zhang, B., and Walker, C. M. (2008) Impact of missing data on person-model fit and person trait estimation. Applied Psychological Measurement, 32(6), 466–479.

See Also

lzpoly

Examples

# Load PerFit and the inadequacy scale data (dichotomous item scores):
library(PerFit)
data(InadequacyData)

# Compute the lz scores using a subsample of the first 200 response vectors:
lz.out <- lz(InadequacyData[1:200,])
# Use parameters estimated externally (in this case item parameters estimated by mirt):
mod <- mirt(InadequacyData[1:200,], 1)
ip.mirt <- coef(mod, IRTpars = TRUE, simplify = TRUE, digits = Inf)$items[,c('a', 'b', 'g')]
lz.out2 <- lz(InadequacyData[1:200,], IP = ip.mirt)

# Compute the lzstar scores using a subsample of the first 200 response vectors:
lzstar.out <- lzstar(InadequacyData[1:200,])

Results


> library(PerFit)
Loading required package: ltm
Loading required package: MASS
Loading required package: msm
Loading required package: polycor
Loading required package: mvtnorm
Loading required package: sfsmisc
Loading required package: mirt
Loading required package: stats4
Loading required package: lattice

Attaching package: 'mirt'

The following object is masked from 'package:ltm':

    Science

> ### Name: lz, lzstar
> ### Title: lz and lzstar person-fit statistics
> ### Aliases: lz lzstar
> ### Keywords: univar
> 
> ### ** Examples
> 
> # Load the inadequacy scale data (dichotomous item scores):
> data(InadequacyData)
> 
> # Compute the lz scores using a subsample of the first 200 response vectors:
> lz.out <- lz(InadequacyData[1:200,])
> # Use parameters estimated externally (in this case item parameters estimated by mirt):
> mod <- mirt(InadequacyData[1:200,], 1)
Iteration: 1, Log-Lik: -2880.608, Max-Change: 1.02544
Iteration: 2, Log-Lik: -2810.238, Max-Change: 0.49410
Iteration: 3, Log-Lik: -2802.405, Max-Change: 0.26430
Iteration: 4, Log-Lik: -2800.166, Max-Change: 0.16204
Iteration: 5, Log-Lik: -2799.153, Max-Change: 0.10914
Iteration: 6, Log-Lik: -2798.647, Max-Change: 0.07428
Iteration: 7, Log-Lik: -2798.190, Max-Change: 0.03321
Iteration: 8, Log-Lik: -2798.108, Max-Change: 0.02406
Iteration: 9, Log-Lik: -2798.055, Max-Change: 0.01761
Iteration: 10, Log-Lik: -2797.970, Max-Change: 0.00883
Iteration: 11, Log-Lik: -2797.956, Max-Change: 0.00597
Iteration: 12, Log-Lik: -2797.946, Max-Change: 0.00498
Iteration: 13, Log-Lik: -2797.915, Max-Change: 0.00335
Iteration: 14, Log-Lik: -2797.913, Max-Change: 0.00392
Iteration: 15, Log-Lik: -2797.911, Max-Change: 0.00309
Iteration: 16, Log-Lik: -2797.908, Max-Change: 0.00141
Iteration: 17, Log-Lik: -2797.908, Max-Change: 0.00120
Iteration: 18, Log-Lik: -2797.907, Max-Change: 0.00117
Iteration: 19, Log-Lik: -2797.906, Max-Change: 0.00153
Iteration: 20, Log-Lik: -2797.906, Max-Change: 0.00062
Iteration: 21, Log-Lik: -2797.905, Max-Change: 0.00072
Iteration: 22, Log-Lik: -2797.905, Max-Change: 0.00056
Iteration: 23, Log-Lik: -2797.905, Max-Change: 0.00051
Iteration: 24, Log-Lik: -2797.905, Max-Change: 0.00043
Iteration: 25, Log-Lik: -2797.905, Max-Change: 0.00037
Iteration: 26, Log-Lik: -2797.905, Max-Change: 0.00034
Iteration: 27, Log-Lik: -2797.905, Max-Change: 0.00031
Iteration: 28, Log-Lik: -2797.905, Max-Change: 0.00028
Iteration: 29, Log-Lik: -2797.905, Max-Change: 0.00026
Iteration: 30, Log-Lik: -2797.905, Max-Change: 0.00024
Iteration: 31, Log-Lik: -2797.905, Max-Change: 0.00023
Iteration: 32, Log-Lik: -2797.905, Max-Change: 0.00021
Iteration: 33, Log-Lik: -2797.905, Max-Change: 0.00020
Iteration: 34, Log-Lik: -2797.905, Max-Change: 0.00019
Iteration: 35, Log-Lik: -2797.905, Max-Change: 0.00018
Iteration: 36, Log-Lik: -2797.905, Max-Change: 0.00017
Iteration: 37, Log-Lik: -2797.905, Max-Change: 0.00016
Iteration: 38, Log-Lik: -2797.905, Max-Change: 0.00015
Iteration: 39, Log-Lik: -2797.905, Max-Change: 0.00015
Iteration: 40, Log-Lik: -2797.905, Max-Change: 0.00014
Iteration: 41, Log-Lik: -2797.905, Max-Change: 0.00013
Iteration: 42, Log-Lik: -2797.905, Max-Change: 0.00013
Iteration: 43, Log-Lik: -2797.905, Max-Change: 0.00012
Iteration: 44, Log-Lik: -2797.905, Max-Change: 0.00011
Iteration: 45, Log-Lik: -2797.905, Max-Change: 0.00056
Iteration: 46, Log-Lik: -2797.905, Max-Change: 0.00043
Iteration: 47, Log-Lik: -2797.905, Max-Change: 0.00033
Iteration: 48, Log-Lik: -2797.905, Max-Change: 0.00026
Iteration: 49, Log-Lik: -2797.905, Max-Change: 0.00008
> ip.mirt <- coef(mod, IRTpars = TRUE, simplify = TRUE, digits = Inf)$items[,c('a', 'b', 'g')]
> lz.out2 <- lz(InadequacyData[1:200,], IP = ip.mirt)
> 
> # Compute the lzstar scores using a subsample of the first 200 response vectors:
> lzstar.out <- lzstar(InadequacyData[1:200,])