Last data update: 2014.03.03

lad {ldr}	R Documentation

Likelihood Acquired Directions

Description

Method to estimate the central subspace, using inverse conditional mean and conditional variance functions.

Usage

lad(X, y, numdir = NULL, nslices = NULL, numdir.test = FALSE, ...)

Arguments

X

Data matrix with n rows of observations and p columns of predictors. The predictors are assumed to have a continuous distribution.

y

Response vector of n observations, possibly categorical or continuous. It is assumed categorical if nslices=NULL.

numdir

Integer between 1 and p. It is the number of directions of the reduction to estimate. If not provided then it will equal the number of distinct values of the categorical response.

nslices

Integer number of slices. It must be provided if y is continuous, and must be less than n. It is used to discretize the continuous response.

numdir.test

Boolean. If FALSE, lad computes the reduction for the specified number of directions numdir. If TRUE, the reduction is computed for each number of directions from 0 through numdir.

...

Other arguments to pass to GrassmannOptim.

Details

Consider a regression in which the response Y is discrete, with support S_Y = {1, 2, ..., h}. Following standard practice, a continuous response can be sliced into finitely many categories to meet this condition. Let X_y ∈ R^p denote a random vector of predictors distributed as X | (Y = y), and assume that X_y ~ N(μ_y, Δ_y), y ∈ S_Y. Let μ = E(X) and Σ = Var(X) denote the marginal mean and variance of X, and let Δ = E(Δ_Y) denote the average covariance matrix. Given n_y independent observations of X_y, y ∈ S_Y, the goal is to obtain the maximum likelihood estimate of the d-dimensional central subspace S_{Y|X}, defined informally as the smallest subspace such that Y is independent of X given its projection P_{S_{Y|X}} X onto S_{Y|X}.
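The slicing step mentioned above can be sketched in base R. This is only an illustration of the standard quantile-based construction for discretizing a continuous response, not the package's internal code:

```r
# Illustrative only: quantile-based slicing of a continuous response
# into h roughly equal-sized categories (the role of 'nslices').
set.seed(1)
y <- rnorm(100)          # a continuous response
h <- 4                   # number of slices
breaks <- quantile(y, probs = seq(0, 1, length.out = h + 1))
y.sliced <- cut(y, breaks = breaks, include.lowest = TRUE, labels = FALSE)
table(y.sliced)          # roughly n/h observations per slice
```

Each slice then plays the role of one category y in S_Y when forming the within-slice covariance matrices.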

Let Σ̃ denote the sample covariance matrix of X, let Δ̃_y denote the sample covariance matrix for the data with Y = y, and let Δ̃ = ∑_{y=1}^{h} m_y Δ̃_y, where m_y is the fraction of cases observed with Y = y. The maximum likelihood estimator of S_{Y|X} maximizes over S ∈ G(d,p) the log-likelihood function

L(S) = (n/2) log|P_S Σ̃ P_S|_0 − (n/2) log|Σ̃| − (1/2) ∑_{y=1}^{h} n_y log|P_S Δ̃_y P_S|_0,

where |A|_0 denotes the product of the non-zero eigenvalues of a positive semi-definite symmetric matrix A, P_S denotes the projection onto the subspace S in the usual inner product, and G(d,p), called the Grassmann manifold, is the set of all d-dimensional subspaces of R^p. Once the dimension of the reduction subspace is estimated, the columns of Γ̂ form a basis for the maximum likelihood estimate of S_{Y|X}, and the desired reduction is then Γ̂^T X.

The dimension d of the sufficient reduction must itself be estimated. A sequential likelihood ratio test and information criteria (AIC, BIC) are implemented for this purpose, following Cook and Forzani (2009).
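As a minimal sketch (not the package's implementation), the objective L(S) above can be evaluated directly in base R for a fixed candidate basis; the actual fit maximizes this quantity over the Grassmann manifold via GrassmannOptim. The function name lad.loglik and the simulated data below are for illustration only:

```r
# Minimal sketch: evaluate the LAD log-likelihood L(S) from Details
# for a fixed orthonormal basis G (p x d) of a candidate subspace S.
set.seed(2)
n <- 150; p <- 4
y <- sample(1:3, n, replace = TRUE)                 # discrete response, h = 3
X <- matrix(rnorm(n * p), n, p) + outer(y, rep(0.5, p))
G <- qr.Q(qr(matrix(rnorm(p * 2), p, 2)))           # candidate basis, d = 2

lad.loglik <- function(G, X, y) {
  n <- nrow(X)
  Sigma <- cov(X) * (n - 1) / n                     # MLE-scaled covariance
  # For orthonormal G, log|P_S A P_S|_0 = log det(G' A G), since the
  # nonzero eigenvalues of G G' A G G' are those of G' A G.
  term <- function(A) determinant(crossprod(G, A %*% G))$modulus
  out <- (n / 2) * term(Sigma) - (n / 2) * determinant(Sigma)$modulus
  for (k in unique(y)) {
    nk <- sum(y == k)
    Dk <- cov(X[y == k, , drop = FALSE]) * (nk - 1) / nk
    out <- out - (nk / 2) * term(Dk)
  }
  as.numeric(out)
}
lad.loglik(G, X, y)
```

Because L(S) depends on the basis only through its span, the value is unchanged if G is rotated within its column space, which is why the optimization is naturally posed on the Grassmann manifold.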

Value

lad returns an object of class ldr, a list whose content depends on the argument numdir.test. If numdir.test=TRUE, a list of matrices is returned for each of the parameters Γ, Δ, and Δ_y, one per number of directions from 1 through numdir; otherwise, a single set of matrices for the specified numdir is returned. The components loglik, aic, bic, and numpar are vectors with one element per number of directions if numdir.test=TRUE, and scalars otherwise. The components returned are the following:

R

The reduced data matrix of X, obtained from the column-centered data matrix, in which each column is centered at its sample mean.

Gammahat

Estimate of Γ

Deltahat

Estimate of Δ

Deltahat_y

Estimate of Δ_y

loglik

Maximized value of the LAD log-likelihood.

aic

Akaike information criterion value.

bic

Bayesian information criterion value.

numpar

Number of parameters in the model.

Author(s)

Kofi Placid Adragni <kofi@umbc.edu>

References

Cook, R. D. and Forzani, L. (2009). Likelihood-based sufficient dimension reduction. Journal of the American Statistical Association, 104(485), 197–208.

See Also

core, pfc

Examples

data(flea)
fit <- lad(X=flea[,-1], y=flea[,1], numdir=2, numdir.test=TRUE)
summary(fit)
plot(fit)
