Last data update: 2014.03.03

R: Forward search algorithm for outlier detection
forward.search    R Documentation

Forward search algorithm for outlier detection

Description

The forward search algorithm begins by selecting a homogeneous subset of cases based on a maximum likelihood criterion and then adds individual cases at each iteration according to an acceptance criterion. By default, the function adds the cases that contribute most to the likelihood function and that have the smallest robust Mahalanobis distance; model-implied residuals may be included as a criterion as well.
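The two phases described above (choose a homogeneous base group, then add one case per iteration) can be illustrated with a minimal base R sketch. This is not the function's implementation: the helper name forward_order_sketch is hypothetical, the "most homogeneous" base group is approximated by the smallest log-determinant of the subset covariance rather than a model likelihood, and only an ordinary (non-robust) Mahalanobis criterion is used.

```r
# Hypothetical helper, not part of the package: a Mahalanobis-only forward search.
forward_order_sketch <- function(X, n.subsets = 200, p.base = 0.4) {
  X <- as.matrix(X)
  N <- nrow(X)
  n.base <- max(ncol(X) + 1, floor(p.base * N))  # assumed rounding rule

  # Phase 1: draw random subsets and keep the most homogeneous one.
  # Assumption: homogeneity is measured by the log-determinant of the covariance.
  best <- NULL
  best.val <- Inf
  for (i in seq_len(n.subsets)) {
    idx <- sample.int(N, n.base)
    S <- cov(X[idx, , drop = FALSE])
    val <- as.numeric(determinant(S, logarithm = TRUE)$modulus)
    if (val < best.val) {
      best.val <- val
      best <- idx
    }
  }

  # Phase 2: at each iteration, add the excluded case that is closest
  # (squared Mahalanobis distance) to the cases currently in the subset.
  inside <- best
  entered <- integer(0)
  while (length(inside) < N) {
    outside <- setdiff(seq_len(N), inside)
    S <- X[inside, , drop = FALSE]
    d <- mahalanobis(X[outside, , drop = FALSE], colMeans(S), cov(S))
    nxt <- outside[which.min(d)]
    inside <- c(inside, nxt)
    entered <- c(entered, nxt)
  }
  list(base = best, entered = entered)
}

set.seed(1)
X <- matrix(rnorm(60 * 3), ncol = 3)
X[60, ] <- c(10, -10, 10)        # plant one gross outlier in the last row
fs <- forward_order_sketch(X)
tail(fs$entered, 5)              # aberrant cases tend to enter at the end
```

Cases that enter the subset late are the candidates for closer inspection, which is why the print method reports the final cases in the sequence.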

Usage

forward.search(data, model, criteria = c("GOF", "mah"), n.subsets = 1000,
  p.base = 0.4, print.messages = TRUE, ...)

## S3 method for class 'forward.search'
print(x, ncases = 10, stat = "GOF", ...)

## S3 method for class 'forward.search'
plot(x, y = NULL, stat = "GOF",
  main = "Forward Search", type = c("p", "h"), ylab = "obs.resid", ...)

Arguments

data

matrix or data.frame

model

if a single numeric value, the number of factors to extract in an exploratory factor analysis. If model is instead a sem model (class semmod) or a lavaan model syntax string (class character), a confirmatory approach is performed

criteria

character strings indicating the forward search method. Can contain 'GOF' for goodness of fit distance, 'mah' for Mahalanobis distance, or 'res' for model-implied residuals

n.subsets

a scalar indicating how many samples to draw to find a homogeneous starting base group

p.base

proportion of sample size to use as the base group

print.messages

logical; print how many iterations are remaining?

...

additional parameters to be passed

x

an object of class forward.search

ncases

number of final cases to print in the sequence

stat

type of statistic to use. Can be 'GOF', 'RMR', or 'gCD' for the model chi-squared value, root mean square residual, or generalized Cook's distance, respectively

y

a null value ignored by plot

main

the main title of the plot

type

type of plot to use; the default displays points and lines

ylab

the y label of the plot

Details

Note that forward.search is not limited to confirmatory factor analysis and can apply to nearly any model being studied where detection of influential observations is important.
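To illustrate the general idea outside factor analysis, the following base R sketch runs a forward search around a linear regression. It does not call forward.search (whose model argument, as documented above, takes factor-analytic specifications); the helper name forward_lm_sketch, the median-squared-residual rule for choosing the starting group, and the simulated data are all assumptions made for illustration.

```r
# Hypothetical helper, not part of the package: forward search around lm().
forward_lm_sketch <- function(formula, data, n.subsets = 100, p.base = 0.4) {
  N <- nrow(data)
  n.base <- floor(p.base * N)    # assumed size of the base group
  y <- model.response(model.frame(formula, data))

  # Absolute residuals of ALL cases under a model fitted only to rows idx
  abs_resid <- function(idx) {
    fit <- lm(formula, data = data[idx, , drop = FALSE])
    abs(y - predict(fit, newdata = data))
  }

  # Starting group: the random subset whose fit gives the smallest median
  # squared residual over all cases (an assumed selection rule).
  starts <- replicate(n.subsets, sample.int(N, n.base), simplify = FALSE)
  crit <- vapply(starts, function(idx) median(abs_resid(idx)^2), numeric(1))
  inside <- starts[[which.min(crit)]]

  # Add the best-fitting excluded case at each iteration
  entered <- integer(0)
  while (length(inside) < N) {
    outside <- setdiff(seq_len(N), inside)
    r <- abs_resid(inside)
    nxt <- outside[which.min(r[outside])]
    inside <- c(inside, nxt)
    entered <- c(entered, nxt)
  }
  entered
}

set.seed(2)
n <- 40
dat <- data.frame(x = rnorm(n))
dat$y <- 1 + 2 * dat$x + rnorm(n, sd = 0.3)
dat$y[n] <- dat$y[n] + 15        # contaminate the last case
order_in <- forward_lm_sketch(y ~ x, dat)
tail(order_in, 3)                # cases that fit the model worst enter last
```

The same loop applies to other models by swapping the fitting call and the per-case discrepancy measure.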

Author(s)

Phil Chalmers <rphilip.chalmers@gmail.com>

See Also

gCD, GOF, LD, robustMD, setCluster

Examples


## Not run: 

#run all internal gCD and GOF functions using multiple cores
setCluster()

#Exploratory
nfact <- 3
(FS <- forward.search(holzinger, nfact))
(FS.outlier <- forward.search(holzinger.outlier, nfact))
plot(FS)
plot(FS.outlier)

#Confirmatory with sem
model <- sem::specifyModel()
    F1 -> Remndrs,    lam11
    F1 -> SntComp,    lam21
    F1 -> WrdMean,    lam31
    F2 -> MissNum,    lam41
    F2 -> MxdArit,    lam52
    F2 -> OddWrds,    lam62
    F3 -> Boots,      lam73
    F3 -> Gloves,     lam83
    F3 -> Hatchts,    lam93
    F1 <-> F1,   NA,     1
    F2 <-> F2,   NA,     1
    F3 <-> F3,   NA,     1


(FS <- forward.search(holzinger, model))
(FS.outlier <- forward.search(holzinger.outlier, model))
plot(FS)
plot(FS.outlier)

#Confirmatory with lavaan
model <- 'F1 =~  Remndrs + SntComp + WrdMean
F2 =~ MissNum + MxdArit + OddWrds
F3 =~ Boots + Gloves + Hatchts'

(FS <- forward.search(holzinger, model))
(FS.outlier <- forward.search(holzinger.outlier, model))
plot(FS)
plot(FS.outlier)



## End(Not run)
