Last data update: 2014.03.03

REEMtree    R Documentation

Create a RE-EM tree

Description

Fit a RE-EM tree to data. This estimates a regression tree combined with a linear random effects model.
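
The alternation between the tree and the random effects model can be illustrated with a rough sketch built directly on rpart and lme. This is not the package's source code: the function name reem_sketch, the toy data, and the random-intercept-only setup are assumptions made for illustration. It omits cross-validated pruning, subsets, correlation structures, and the final update of the tree's node values.

```r
library(rpart)
library(nlme)

# Hypothetical toy data: 30 groups, 8 observations each, one step in x.
set.seed(1)
n_groups <- 30
n_per <- 8
toy <- data.frame(
  id = factor(rep(seq_len(n_groups), each = n_per)),
  x  = runif(n_groups * n_per)
)
group_effect <- rnorm(n_groups, sd = 2)
toy$y <- ifelse(toy$x < 0.5, 1, 4) +
  group_effect[as.integer(toy$id)] +
  rnorm(nrow(toy), sd = 0.5)

# Illustrative alternation; NOT the REEMtree implementation.
reem_sketch <- function(data, tol = 0.001, max_iter = 100) {
  re_part <- rep(0, nrow(data))  # random-effect contribution per observation
  old_loglik <- -Inf
  for (iter in seq_len(max_iter)) {
    # Step 1: fit a tree to the response with the current random effects removed.
    data$adj_y <- data$y - re_part
    tree_fit <- rpart(adj_y ~ x, data = data)

    # Step 2: fit a linear mixed model with one fixed effect per terminal node.
    data$node <- factor(tree_fit$where)
    if (nlevels(data$node) > 1) {
      lme_fit <- lme(y ~ node, random = ~ 1 | id, data = data)
    } else {
      lme_fit <- lme(y ~ 1, random = ~ 1 | id, data = data)
    }

    # Step 3: extract the updated random-effect contribution.
    re_part <- fitted(lme_fit, level = 1) - fitted(lme_fit, level = 0)

    # Stop when the log likelihood changes by less than the tolerance.
    new_loglik <- as.numeric(logLik(lme_fit))
    if (abs(new_loglik - old_loglik) < tol) break
    old_loglik <- new_loglik
  }
  list(tree = tree_fit, lme = lme_fit, iterations = iter, logLik = new_loglik)
}

fit <- reem_sketch(toy)
print(fit$tree)
```

In this sketch, tol and max_iter play the roles that ErrorTolerance and MaxIterations play in REEMtree.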

Usage

REEMtree(formula, data, random, subset=NULL, initialRandomEffects=rep(0,TotalObs), 
		ErrorTolerance=0.001, MaxIterations=1000, verbose=FALSE, tree.control=rpart.control(), 
		cv=TRUE, cpmin = 0.001, no.SE =1,
		lme.control=lmeControl(returnObject=TRUE), method="REML", correlation=NULL)

Arguments

formula

a formula, as in the lm or rpart function

data

a data frame in which to interpret the variables named in the formula (unlike in lm or rpart, this is not optional)

random

a description of the random effects, as a formula of the form ~1|g, where g is the grouping variable

subset

an optional logical vector indicating the subset of the rows of data that should be used in the fit. All observations are included by default.

initialRandomEffects

an optional vector giving initial values for the random effects to use in estimation

ErrorTolerance

convergence tolerance: estimation stops when the likelihoods of the linear models from two consecutive iterations differ by less than this value

MaxIterations

maximum number of iterations allowed in estimation

verbose

if TRUE, the current estimate of the RE-EM tree will be printed after each iteration

tree.control

a list of control values that replace the defaults used to control the rpart algorithm. Defaults to rpart.control().

cv

if TRUE then cross-validation will be used for estimating the tree at each iteration. Default is TRUE.

cpmin

complexity parameter used in building a tree before cross-validation

no.SE

number of standard errors used in pruning (0 if unused)

lme.control

a list of control values that replace the defaults returned by the function lmeControl. Defaults to lmeControl(returnObject=TRUE).

method

estimation method for the linear model: "ML" (maximum likelihood) or "REML" (restricted maximum likelihood, the default)

correlation

an optional corStruct object describing the within-group correlation structure; the available classes are given in corClasses
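
One plausible reading of how cv, cpmin, and no.SE interact is sketched below using rpart alone: grow a large tree with complexity parameter cpmin, then prune it to the simplest tree whose cross-validated error is within no.SE standard errors of the minimum. The toy data and variable names are assumptions, and the package's own pruning code may differ in detail.

```r
library(rpart)

# Hypothetical toy data with a single true split on x1.
set.seed(2)
toy <- data.frame(x1 = runif(300), x2 = runif(300))
toy$y <- ifelse(toy$x1 < 0.4, 0, 3) + rnorm(300)

cpmin <- 0.001  # complexity parameter for the initial, deliberately large tree
no_se <- 1      # number of standard errors allowed above the minimum error

# Grow a large tree; rpart runs 10-fold cross-validation by default (xval = 10).
big_tree <- rpart(y ~ x1 + x2, data = toy,
                  control = rpart.control(cp = cpmin, xval = 10))

# Locate the row with the smallest cross-validated error.
cp_table <- big_tree$cptable
best <- which.min(cp_table[, "xerror"])

# Accept any tree whose error is within no_se standard errors of that minimum,
# and take the first such row, which has the fewest splits.
threshold <- cp_table[best, "xerror"] + no_se * cp_table[best, "xstd"]
chosen <- min(which(cp_table[, "xerror"] <= threshold))

pruned <- prune(big_tree, cp = cp_table[chosen, "CP"])
print(pruned)
```

Setting no_se to 0 in this sketch reduces the rule to picking the tree with the minimum cross-validated error.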

Value

an object of class REEMtree
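
A short sketch of inspecting the returned object follows. It assumes the REEMtree package is installed and that the usual print, ranef, logLik, and fitted methods are available for objects of class REEMtree; see REEMtree.object for the authoritative list of components.

```r
# Runs only if the REEMtree package is available.
if (requireNamespace("REEMtree", quietly = TRUE)) {
  library(REEMtree)
  data(simpleREEMdata)

  fit <- REEMtree(Y ~ D + t + X, data = simpleREEMdata, random = ~ 1 | ID)

  print(class(fit))         # class of the returned object
  print(fit)                # tree, random effects covariance, error variance
  print(head(ranef(fit)))   # estimated random effects for the first groups
  print(logLik(fit))        # log likelihood of the final linear model
  print(head(fitted(fit)))  # fitted values for the first observations
}
```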

Author(s)

Rebecca Sela rsela@stern.nyu.edu

References

Sela, Rebecca J., and Simonoff, Jeffrey S., “RE-EM Trees: A Data Mining Approach for Longitudinal and Clustered Data”, Machine Learning (2011).

See Also

rpart, nlme, REEMtree.object, corClasses

Examples

data(simpleREEMdata)
REEMresult<-REEMtree(Y~D+t+X, data=simpleREEMdata, random=~1|ID)

# Estimation allowing for autocorrelation
REEMresult<-REEMtree(Y~D+t+X, data=simpleREEMdata, random=~1|ID, 
	correlation=corAR1())

# Random parameters model for the random effects
REEMresult<-REEMtree(Y~D+t+X, data=simpleREEMdata, random=~1+X|ID)

# Estimation with a subset
sub <- rep(c(rep(TRUE, 10), rep(FALSE, 2)), 50)
REEMresult<-REEMtree(Y~D+t+X, data=simpleREEMdata, random=~1|ID, 
	subset=sub)

# Dataset from the R library "AER"
data("Grunfeld", package = "AER")
REEMtree(invest ~ value + capital, data=Grunfeld, random=~1|firm)
REEMtree(invest ~ value + capital, data=Grunfeld, random=~1|firm, correlation=corAR1())
REEMtree(invest ~ value + capital, data=Grunfeld, random=~1+year|firm)
REEMtree(invest ~ value + capital, data=Grunfeld, random=~1|firm/year)

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(REEMtree)
Loading required package: nlme
Loading required package: rpart
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/REEMtree/REEMtree.Rd_%03d_medium.png", width=480, height=480)
> ### Name: REEMtree
> ### Title: Create a RE-EM tree
> ### Aliases: REEMtree
> ### Keywords: tree models
> 
> ### ** Examples
> 
> data(simpleREEMdata)
> REEMresult<-REEMtree(Y~D+t+X, data=simpleREEMdata, random=~1|ID)
> 
> # Estimation allowing for autocorrelation
> REEMresult<-REEMtree(Y~D+t+X, data=simpleREEMdata, random=~1|ID, 
+ 	correlation=corAR1())
> 
> # Random parameters model for the random effects
> REEMresult<-REEMtree(Y~D+t+X, data=simpleREEMdata, random=~1+X|ID)
> 
> # Estimation with a subset
> sub <- rep(c(rep(TRUE, 10), rep(FALSE, 2)), 50)
> REEMresult<-REEMtree(Y~D+t+X, data=simpleREEMdata, random=~1|ID, 
+ 	subset=sub)
> 
> # Dataset from the R library "AER"
> data("Grunfeld", package = "AER")
> REEMtree(invest ~ value + capital, data=Grunfeld, random=~1|firm)
[1] "*** RE-EM Tree ***"
n= 220 

node), split, n, deviance, yval
      * denotes terminal node

1) root 220 3502993.0 133.3119  
  2) capital< 905.65 213  816514.8 116.7423  
    4) value< 2023.55 183  199637.3 101.4119 *
    5) value>=2023.55 30  311514.9 210.2577 *
  3) capital>=905.65 7  848559.4 637.5000 *
[1] "Estimated covariance matrix of random effects:"
            (Intercept)
(Intercept)    16102.04
[1] "Estimated variance of errors: 6560.76154717947"
[1] "Log likelihood:  -1285.68868990103"
> REEMtree(invest ~ value + capital, data=Grunfeld, random=~1|firm, correlation=corAR1())
[1] "*** RE-EM Tree ***"
n= 220 

node), split, n, deviance, yval
      * denotes terminal node

1) root 220 9709554.0 133.3119  
  2) value< 2023.55 183 1185539.0 140.4533 *
  3) value>=2023.55 37 3315838.0 475.4931  
    6) capital< 905.65 30 1049896.0 222.9857 *
    7) capital>=905.65 7  848559.4 204.8394 *
[1] "Estimated covariance matrix of random effects:"
            (Intercept)
(Intercept)    4.781951
[1] "Estimated variance of errors: 80619.1492573218"
[1] "Log likelihood:  -1178.27919592153"
> REEMtree(invest ~ value + capital, data=Grunfeld, random=~1+year|firm)
[1] "*** RE-EM Tree ***"
n= 220 

node), split, n, deviance, yval
      * denotes terminal node

1) root 220 3479458.0 129.2651  
  2) capital< 905.65 213  809492.6 113.5582 *
  3) capital>=905.65 7  847378.7 538.3319 *
[1] "Estimated covariance matrix of random effects:"
             (Intercept)         year
(Intercept) 5.214837e-09 4.910460e-06
year        4.910460e-06 6.604069e-03
[1] "Estimated variance of errors: 6914.0935068201"
[1] "Log likelihood:  -1297.95669421635"
> REEMtree(invest ~ value + capital, data=Grunfeld, random=~1|firm/year)
[1] "*** RE-EM Tree ***"
n= 220 

node), split, n, deviance, yval
      * denotes terminal node

1) root 220 2179231.000 133.3119  
  2) capital< 905.65 213  318877.000 116.7423  
    4) value< 2023.55 183    5278.213 101.4119 *
    5) value>=2023.55 30    8236.144 210.2577 *
  3) capital>=905.65 7   22435.070 637.5000 *
[1] "Estimated covariance matrix of random effects:"
            (Intercept)
(Intercept)    5493.977
[1] "Estimated variance of errors: 1066.78486808815"
[1] "Log likelihood:  -1285.68868990103"
> dev.off()
null device 
          1 
>