R: MM-Estimates for Multivariate Regression with Bootstrap...
FRBmultiregMM
R Documentation
MM-Estimates for Multivariate Regression with Bootstrap Inference
Description
Computes MM-estimates for multivariate regression together with standard errors, confidence intervals
and p-values based on the Fast and Robust Bootstrap.
Usage
## S3 method for class 'formula'
FRBmultiregMM(formula, data=NULL, ...)
## Default S3 method:
FRBmultiregMM(X, Y, int = TRUE, R = 999, conf = 0.95,
control=MMcontrol(...), na.action=na.omit, ...)
Arguments
formula
an object of class formula; a symbolic description of the model to be fit.
data
data frame from which variables specified in formula are to be taken.
X
a matrix or data frame containing the explanatory variables.
Y
a matrix or data frame containing the response variables.
int
logical: if TRUE an intercept term is added to the model (unless it is already present in X)
R
number of bootstrap samples. Default is R=999.
conf
level of the bootstrap confidence intervals. Default is conf=0.95
control
a list with control parameters for tuning the MM-estimate and its computing algorithm,
see MMcontrol().
na.action
a function which indicates what should happen when the data contain NAs. Defaults to na.omit.
...
allows for specifying control parameters directly instead of via control
Details
Multivariate MM-estimates combine high breakdown point and high Gaussian efficiency. They are defined by first computing an S-estimate of regression,
then fixing the scale component of the error covariance estimate, and finally re-estimating the regression coefficients
and the shape part of the error covariance by a more efficient M-estimate (see Tatsuoka and Tyler (2000) for MM-estimates
in the special case of location/scatter estimation, and Van Aelst and Willems (2005) for S-estimates of multivariate regression).
Tukey's biweight is used for the loss functions. By default, the first loss function (in the S-estimate) is tuned in order to obtain 50% breakdown point.
The default tuning of the second loss function (M-estimate) ensures 95% efficiency at the normal model for the coefficient estimates.
The desired efficiency can be changed through argument control.
The computation is carried out by a call to MMest_multireg(), which first performs the fast-S algorithm
(see Sest_multireg) and does the M-part by reweighted least squares (RWLS) iteration.
See MMcontrol for some adjustable tuning parameters regarding the algorithm.
The result of this call is also returned as the value est.
The Fast and Robust Bootstrap (Salibian-Barrera and Zamar 2002) is used to calculate so-called
basic bootstrap confidence intervals and bias corrected and accelerated (BCa)
confidence intervals (Davison and Hinkley 1997, p.194 and p.204 respectively).
Apart from the intervals with the requested confidence level, the function also returns p-values for each coefficient
corresponding to the hypothesis that the actual coefficient is zero. The p-values are computed as
1 minus the smallest level for which the confidence intervals would include zero. Both BCa and basic bootstrap p-values in this sense are given.
The bootstrap calculation is carried out by a call to MMboot_multireg(), the result
of which is returned as the value bootest. Bootstrap standard errors are returned as well.
Note: Bootstrap samples which contain too few distinct observations with positive weights are discarded
(a warning is given if this happens). The number of samples actually used is returned via ROK.
In the formula-interface, a multivariate response is produced via cbind. For example cbind(x4,x5) ~ x1+x2+x3.
All arguments from the default method can also be passed to the formula method except for int (passing int explicitely
will produce an error; the inclusion of an intercept term is determined by formula).
The returned object inherits from class mlm such that the standard coef, residuals, fitted and predict functions can be used.
Value
An object of class FRBmultireg which extends class mlm and contains at least the following components:
coefficients
MM-estimates of the regression coefficients
residuals
the residuals, that is response minus fitted values
fitted.values
the fitted values.
Sigma
MM-estimate of the error covariance matrix
scale
MM-estimate of the size of the multivariate errors
weights
implicit weights corresponding to the MM-estimates (i.e. final weights in the RWLS procedure)
outFlag
outlier flags: 1 if the robust distance of the residual exceeds the .975 quantile of (the square root of)
the chi-square distribution with degrees of freedom equal to the dimension of the responses; 0 otherwise
SE
bootstrap standard errors corresponding the regression coefficients
cov
bootstrap covariance matrix corresponding to the regression coefficients (in vectorized form)
CI.bca.lower
a matrix containing the lower bounds of the bias corrected and accelerated confidence intervals for the regression coefficients.
CI.bca.upper
a matrix containing the upper bounds of the bias corrected and accelerated confidence intervals for the regression coefficients.
CI.basic.lower
a matrix containing the lower bounds of basic bootstrap intervals for the regression coefficients.
CI.basic.upper
a matrix containing the upper bounds of basic bootstrap intervals for the regression coefficients.
p.bca
a matrix containing the p-values based on the BCa confidence intervals for the regression coefficients.
p.basic
a matrix containing the p-values based on the basic bootstrap intervals for the regression coefficients.
est
MM-estimates as returned by the call to MMest_multireg()
bootest
bootstrap results for the MM-estimates as returned by the call to MMboot_multireg()
conf
a copy of the conf argument
method
a list with following components: est = character string indicating that MM-estimates were used,
bdp = a copy of bdp from the control argument, and eff = a copy of eff from the control argument
control
a copy of the control argument
X, Y
either copies of the respective arguments or the corresponding matrices produced from formula
ROK
number of bootstrap samples actually used (i.e. not discarded due to too few distinct observations
with positive weight)
Author(s)
Gert Willems, stefan Van Aelst and Ella Roelant
References
A.C. Davison and D.V. Hinkley (1997) Bootstrap Methods and their Application. Cambridge Series in
Statistical and Probabilistic Mathematics. Cambridge: Cambridge University Press.
M. Salibian-Barrera, S. Van Aelst and G. Willems (2008) Fast and robust
bootstrap. Statistical Methods and Applications, 17, 41-71.
M. Salibian-Barrera, R.H. Zamar (2002) Bootstrapping robust estimates of
regression. The Annals of Statistics, 30, 556-582.
K.S. Tatsuoka and D.E. Tyler (2000) The uniqueness of S and M-functionals under non-elliptical distributions.
The Annals of Statistics, 28, 1219-1243.
S. Van Aelst and G. Willems (2005) Multivariate regression S-estimators for robust estimation and
inference. Statistica Sinica, 15, 981-1001.
S. Van Aelst and G. Willems (2013). Fast and robust bootstrap for multivariate inference: The R package FRB. Journal of Statistical Software, 53(3), 1–32.
URL: http://www.jstatsoft.org/v53/i03/.
data(schooldata)
school.x <- data.matrix(schooldata[,1:5])
school.y <- data.matrix(schooldata[,6:8])
#computes MM-estimate and 95% confidence intervals
#based on 999 bootstrap samples:
MMres <- FRBmultiregMM(school.x, school.y, R=999, conf = 0.95)
#or, equivalently using the formula interface
## Not run: MMres <- FRBmultiregMM(cbind(reading,mathematics,selfesteem)~., data=schooldata,
R=999, conf = 0.95)
## End(Not run)
#the print method displays the coefficient estimates
MMres
#the summary function additionally displays the bootstrap standard errors and p-values
#("BCA" method by default)
summary(MMres)
summary(MMres, confmethod="basic")
#ask explicitely for the coefficient matrix:
MMres$coefficients
# or equivalently,
coef(MMres)
#For the error covariance matrix:
MMres$Sigma
#plot some bootstrap histograms for the coefficient estimates
#(with "BCA" intervals by default)
plot(MMres, expl=c("education", "occupation"), resp=c("selfesteem","reading"))
#plot bootstrap histograms for all coefficient estimates
plot(MMres)
#probably the plot-function has made a selection of coefficients to plot here,
#since 'all' was too many to fit on one page, see help(plot.FRBmultireg);
#this is platform-dependent
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(FRB)
Loading required package: corpcor
Loading required package: rrcov
Loading required package: robustbase
Scalable Robust Estimators with High Breakdown Point (version 1.3-11)
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/FRB/FRBmultiregMM.Rd_%03d_medium.png", width=480, height=480)
> ### Name: FRBmultiregMM
> ### Title: MM-Estimates for Multivariate Regression with Bootstrap
> ### Inference
> ### Aliases: FRBmultiregMM FRBmultiregMM.default FRBmultiregMM.formula
> ### Keywords: multivariate robust
>
> ### ** Examples
>
> data(schooldata)
> school.x <- data.matrix(schooldata[,1:5])
> school.y <- data.matrix(schooldata[,6:8])
>
> #computes MM-estimate and 95% confidence intervals
> #based on 999 bootstrap samples:
> MMres <- FRBmultiregMM(school.x, school.y, R=999, conf = 0.95)
> #or, equivalently using the formula interface
> ## Not run:
> ##D MMres <- FRBmultiregMM(cbind(reading,mathematics,selfesteem)~., data=schooldata,
> ##D R=999, conf = 0.95)
> ## End(Not run)
>
> #the print method displays the coefficient estimates
> MMres
Multivariate regression based on multivariate MM-estimates (bdp = 0.5, eff = 0.95)
Coefficients:
reading mathematics selfesteem
(intercept) 2.1957 2.7546 0.27534
education 0.1259 0.0490 -0.01146
occupation 5.0490 5.6821 1.63797
visit -0.0441 -0.0162 0.24373
counseling -0.7290 -0.7422 0.00646
teacher -0.1677 -0.2384 0.03407
>
> #the summary function additionally displays the bootstrap standard errors and p-values
> #("BCA" method by default)
> summary(MMres)
Multivariate regression based on MM-estimates (bdp = 0.5, eff = 0.95)
Response reading:
Residuals:
Min 1Q Median 3Q Max
-13.826 -1.560 0.416 2.891 27.861
Coefficients:
Estimate Std.Error p-value
(intercept) 2.1957 1.0671 0.05081 .
education 0.1259 0.0775 0.10568
occupation 5.0490 1.4323 0.00236 **
visit -0.0441 0.3989 0.90819
counseling -0.7290 0.1927 0.00000 ***
teacher -0.1677 0.1590 0.13618
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
p-values based on BCA method!
Response mathematics:
Residuals:
Min 1Q Median 3Q Max
-9.10 -2.08 -0.24 3.49 40.31
Coefficients:
Estimate Std.Error p-value
(intercept) 2.7546 1.1242 0.01092 *
education 0.0490 0.0807 0.54706
occupation 5.6821 1.3766 0.00000 ***
visit -0.0162 0.3600 0.96601
counseling -0.7422 0.2468 0.00292 **
teacher -0.2384 0.1662 0.06588 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
p-values based on BCA method!
Response selfesteem:
Residuals:
Min 1Q Median 3Q Max
-2.099 -0.585 0.135 0.883 4.925
Coefficients:
Estimate Std.Error p-value
(intercept) 0.27534 0.2855 0.37403
education -0.01146 0.0237 0.61675
occupation 1.63797 0.3224 0.00000 ***
visit 0.24373 0.0849 0.00311 **
counseling 0.00646 0.0710 0.89626
teacher 0.03407 0.0405 0.36489
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
p-values based on BCA method!
Robust residual scale: 1.82
Error covariance matrix estimate:
reading mathematics selfesteem
reading 10.56 9.84 2.12
mathematics 9.84 14.29 1.88
selfesteem 2.12 1.88 1.10
>
> summary(MMres, confmethod="basic")
Multivariate regression based on MM-estimates (bdp = 0.5, eff = 0.95)
Response reading:
Residuals:
Min 1Q Median 3Q Max
-13.826 -1.560 0.416 2.891 27.861
Coefficients:
Estimate Std.Error p-value
(intercept) 2.1957 1.0671 0.0521 .
education 0.1259 0.0775 0.1021
occupation 5.0490 1.4323 0.0020 **
visit -0.0441 0.3989 0.8909
counseling -0.7290 0.1927 0.0000 ***
teacher -0.1677 0.1590 0.3363
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
p-values based on Basic Bootstrap!
Response mathematics:
Residuals:
Min 1Q Median 3Q Max
-9.10 -2.08 -0.24 3.49 40.31
Coefficients:
Estimate Std.Error p-value
(intercept) 2.7546 1.1242 0.036 *
education 0.0490 0.0807 0.464
occupation 5.6821 1.3766 0.000 ***
visit -0.0162 0.3600 0.953
counseling -0.7422 0.2468 0.004 **
teacher -0.2384 0.1662 0.216
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
p-values based on Basic Bootstrap!
Response selfesteem:
Residuals:
Min 1Q Median 3Q Max
-2.099 -0.585 0.135 0.883 4.925
Coefficients:
Estimate Std.Error p-value
(intercept) 0.27534 0.2855 0.38238
education -0.01146 0.0237 0.60260
occupation 1.63797 0.3224 0.00000 ***
visit 0.24373 0.0849 0.00601 **
counseling 0.00646 0.0710 0.88689
teacher 0.03407 0.0405 0.31632
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
p-values based on Basic Bootstrap!
Robust residual scale: 1.82
Error covariance matrix estimate:
reading mathematics selfesteem
reading 10.56 9.84 2.12
mathematics 9.84 14.29 1.88
selfesteem 2.12 1.88 1.10
>
> #ask explicitely for the coefficient matrix:
> MMres$coefficients
reading mathematics selfesteem
(intercept) 2.19568722 2.75459147 0.275344103
education 0.12587721 0.04901664 -0.011459842
occupation 5.04902053 5.68212541 1.637972563
visit -0.04408342 -0.01622288 0.243728453
counseling -0.72898667 -0.74220282 0.006462549
teacher -0.16767503 -0.23841061 0.034070553
> # or equivalently,
> coef(MMres)
reading mathematics selfesteem
(intercept) 2.19568722 2.75459147 0.275344103
education 0.12587721 0.04901664 -0.011459842
occupation 5.04902053 5.68212541 1.637972563
visit -0.04408342 -0.01622288 0.243728453
counseling -0.72898667 -0.74220282 0.006462549
teacher -0.16767503 -0.23841061 0.034070553
> #For the error covariance matrix:
> MMres$Sigma
reading mathematics selfesteem
reading 10.556926 9.835050 2.124301
mathematics 9.835050 14.288473 1.877485
selfesteem 2.124301 1.877485 1.096821
>
> #plot some bootstrap histograms for the coefficient estimates
> #(with "BCA" intervals by default)
> plot(MMres, expl=c("education", "occupation"), resp=c("selfesteem","reading"))
>
> #plot bootstrap histograms for all coefficient estimates
> plot(MMres)
Warning message:
In plot.FRBmultireg(MMres) :
Number of plots too large to fit on the page, subset was selected: consider specifying (fewer)
variables in 'expl' and 'resp'; or enlarge graphics device; or set 'onepage=FALSE'
> #probably the plot-function has made a selection of coefficients to plot here,
> #since 'all' was too many to fit on one page, see help(plot.FRBmultireg);
> #this is platform-dependent
>
>
>
>
>
>
> dev.off()
null device
1
>