Last data update: 2014.03.03

R: plot of two-way model interactions
model.interaction.plotR Documentation

plot of two-way model interactions

Description

Image or Perspective plot of two-way model interactions. Ranges of two specified predictor variables are plotted on X and Y axis, and fitted model values are plotted on the Z axis. The remaining predictor variables are fixed at their mean (for continuous predictors) or their most common value (for categorical predictors).

Usage

model.interaction.plot(model.obj = NULL, x = NULL, y = NULL, 
response.category=NULL, quantiles=NULL, all=FALSE, obs=1, qdata.trainfn = NULL, 
folder = NULL, MODELfn = NULL,  PLOTfn = NULL, pred.means = NULL, 
xlab = NULL, ylab = NULL, x.range = NULL, y.range = NULL, 
z.range = NULL, ticktype = "detailed", theta = 55, phi = 40, 
smooth = "none", plot.type = NULL, device.type = NULL, 
res=NULL, jpeg.res = 72, device.width = 7,  device.height = 7, 
units="in", pointsize=12, cex=par()$cex, 
col = NULL, xlim = NULL, ylim = NULL, zlim = NULL, ...)

Arguments

model.obj

R model object. The model object to use for prediction. The model object must be of type "RF" (random forest), "QRF" (quantile random forest), "CF" (conditional forest), or "SGB" (stochastic gradient boosting).

x

String or Integer. Name of predictor variable to be plotted on the x axis. Alternativly, can be a number indicating a variable name from predList.

y

String or Integer. Name of predictor variable to be plotted on the y axis. Alternatively, can be a number indicating a variable name from predList.

response.category

String. Used for categorical response models. Specify which category of response variable to use. This category's probabilities will be plotted on the z axis.

quantiles

Numeric. Used for QRF models. Specify which quantile of response variable to use. This quantile will be plotted on the z axis. Note: unlike other functions model.interaction.plot will only use a single quantile. If quantiles is a vector only the first value will be used.

all

Logical. Used for QRF models. A logical value. all=TRUE uses all observations for prediction. all=FALSE uses only a certain number of observations per node for prediction (set with argument obs). The default is all=FALSE.

obs

Numeric. Used for QRF models. An integer number. Determines the maximal number of observations per node to use for prediction. The input is ignored for all=TRUE. The default is obs=1.

qdata.trainfn

String. The name (full path or base name with path specified by folder) of the training data file used for building the model (file should include columns for both response and predictor variables). The file must be a comma-delimited file *.csv with column headings. qdata.trainfn can also be an R dataframe. If predictions will be made (predict = TRUE or map=TRUE) the predictor column headers must match the names of the raster layer files, or a rastLUT must be provided to match predictor columns to the appropriate raster and band. If qdata.trainfn = NULL (the default), a GUI interface prompts user to browse to the training data file.

folder

String. The folder used for all output. Do not add ending slash to path string. If folder = NULL (default), a GUI interface prompts user to browse to a folder. To use the working directory, specify folder = getwd().

MODELfn

String. The file name used to save the generated model object, only used if PLOTfn = NULL. If MODELfn is supplied and If PLOTfn = NULL, a graphical file name is generated by pasting MODELfn_plot.type_x.name_y.name. If PLOTfn = NULL and MODELfn = NULL, a default name is generated by pasting model.type_response.type_response.name_plot.type_x.name_y.name. The filename can be the full path, or it can be the simple basename, in which case the output will be to the folder specified by folder.

PLOTfn

String. The file name to use to save the generated graphical plots. The filename can be the full path, or it can be the simple basename, in which case the output will be to the folder specified by folder.

pred.means

Vector. Allows specification of values for other predictor variables. If Null, other predictors are set to their mean value (for continuous predictors) or their most common value (for factored predictors).

xlab

String. Allows manual specification of the x label.

ylab

String. Allows manual specification of the y label.

x.range

Vector. Manual range specification for the x axis. Alternate argument name for xlim. Use one or the other. Do not provide both x.range and xlim.

y.range

Vector. Manual range specification for the y axis. Alternate argument name for ylim. Use one or the other. Do not provide both y.range and ylim.

z.range

Vector. Manual range specification for the z axis. Alternate argument name for zlim. Use one or the other. Do not provide both z.range and zlim.

ticktype

Character: "simple" draws just an arrow parallel to the axis to indicate direction of increase; "detailed" (default) draws normal ticks as per 2D plots. If X or y is factored, ticks will be drawn on both axes.

theta

Numeric. Angles defining the viewing direction. theta gives the azimuthal direction.

phi

Numeric. Angles defining the viewing direction. phi gives the colatitude.

smooth

String. controls smoothing of the predicted surface. Options are "none" (default), "model" which uses a glm model to smooth the surface, and "average" which applies a 3x3 smoothing average. Note: smoothing is not appropriate if X or y is factored.

plot.type

Character. "persp" gives a 3-D perspective plot. "image" gives an image plot.

device.type

String or vector of strings. Model validation. One or more device types for graphical output from model validation diagnostics.

Current choices:

"default" default graphics device
"jpeg" *.jpg files
"none" no graphics device generated
"pdf" *.pdf files
"postscript" *.ps files
"win.metafile" *.emf files
res

Integer. Model validation. Pixels per inch for jpeg, png, and tiff plots. The default is 72dpi, good for on screen viewing. For printing, suggested setting is 300dpi.

jpeg.res

Integer. Model validation. Deprecated. Ignored unless res not provided.

device.width

Integer. Model validation. The device width for diagnostic plots in inches.

device.height

Integer. Model validation. The device height for diagnostic plots in inches.

units

Model validation. The units in which device.height and device.width are given. Can be "px" (pixels), "in" (inches, the default), "cm" or "mm".

pointsize

Integer. Model validation. The default pointsize of plotted text, interpreted as big points (1/72 inch) at res ppi

cex

Integer. Model validation. The cex for diagnostic plots.

col

Vector. Color table to use for image plots ( see help file on image for details).

xlim

Vector. X limits. Alternate argument name for x.range. Use one or the other. Do not provide both x.range and xlim.

ylim

Vector. Y limits. Alternate argument name for y.range. Use one or the other. Do not provide both y.range and ylim.

zlim

Vector. Z limits. Alternate argument name for z.range. Use one or the other. Do not provide both z.range and zlim.

...

additional graphical parameters (see par).

Details

This function provides a diagnostic plot useful in visualizing two-way interactions between predictor variables. Two of the predictor variables from the model are used to produce a grid of possible combinations of predictor values over the range of both variables. The remaining predictor variables from the model are fixed at either their means (for continuous predictors) or their most common value (for categorical predictors). Model predictions are generated over this grid and plotted as the z axis.

This function works with both continuous and categorical predictors, though the perspective plot should be interpreted with care for categorical predictors. In particular, the smooth option is not appropriate if either of the two selected predictor variables is categorical.

For categorical response models, a particular value must be specified for the response using the response.category argument.

Author(s)

Elizabeth Freeman

References

This function is adapted from gbm.perspec version 2.9 April 2007, J Leathwick/J Elith. See appendix S3 from:

Elith, J., Leathwick, J. R. and Hastie, T. (2008). A working guide to boosted regression trees. Journal of Animal Ecology. 77:802-813.

Examples

###########################################################################
############################# Run this set up code: #######################
###########################################################################

# set seed:
seed=38

# Define training and test files:

qdata.trainfn = system.file("extdata", "helpexamples","DATATRAIN.csv", package = "ModelMap")
qdata.testfn = system.file("extdata", "helpexamples","DATATEST.csv", package = "ModelMap")

# Define folder for all output:
folder=getwd()	

########## Continuous Response, Categorical Predictors ############


#file name to store model:
MODELfn="RF_BIO_TCandNLCD"			

#predictors:
predList=c("TCB","TCG","TCW","NLCD")

#define which predictors are categorical:
predFactor=c("NLCD")

# Response name and type:
response.name="BIO"
response.type="continuous"

#identifier for individual training and test data points

unique.rowname="ID"


###########################################################################
########################### build model: ##################################
###########################################################################


### create model ###

model.obj = model.build( model.type="RF",
                       qdata.trainfn=qdata.trainfn,
                       folder=folder,		
                       unique.rowname=unique.rowname,		
                       MODELfn=MODELfn,
                       predList=predList,
                       predFactor=predFactor,
                       response.name=response.name,
                       response.type=response.type,
                       seed=seed,
                       na.action=na.roughfix
)

###########################################################################
###################### make interaction plots: ############################
###########################################################################

#########################
### Perspective Plots ###
#########################


### specify first and third  predictors in 'predList (both continuous) ###

model.interaction.plot(	model.obj,
			x=1,y=3, 
			main=response.name, 
			plot.type="persp", 
			device.type="default") 

## Not run: 
### specify predictors in 'predList' by name (one continuous one factored) ###

model.interaction.plot(model.obj,
			x="TCB", y="NLCD", 
			main=response.name, 
			plot.type="persp", 
			device.type="default") 


###################
### Image Plots ###
###################

### same as previous example, but image plot ###


l <- seq(100,0,length.out=101)
c <- seq(0,100,length.out=101)
col.ramp <- hcl(h = 120, c = c, l = l)
 

model.interaction.plot(	model.obj,
				x="TCB", y="NLCD",
				main=response.name,
				plot.type="image", 
				device.type="default",
				col = col.ramp) 


#########################
### 3-way Interaction ###
#########################

### use 'pred.means' argument to fix values of additional predictors ###

### factored 3rd predictor ###

interaction between TCG and TCW for 3 most common values of NLCD

nlcd<-levels(model.obj$predictor.data$NLCD)
nlcd.counts<-table(model.obj$predictor.data$NLCD)
nlcd.ordered<-nlcd[order(nlcd.counts,decreasing=TRUE)]

for(i in nlcd.ordered[1:3]){
	pred.means=list(NLCD=i)

	model.interaction.plot(	model.obj,
				x="TCG", y="TCW",  
				main=paste("NLCD=",i," (",nlcd.counts[i]," plots)", sep=""),
				pred.means=pred.means, 
				z.range=c(0,110),
				theta=290,
				plot.type="persp", 
				device.type="default") 
}



### continuos 3rd predictor ###


tcb<-seq(	min(model.obj$predictor.data$TCB),
		max(model.obj$predictor.data$TCB),
		length=3)

tcb<-signif(tcb,2)

for(i in tcb){
	pred.means=list(TCB=i)

	model.interaction.plot(	model.obj,
				x="TCG", y="TCW",  
				main=paste("TCB =",i),
				pred.means=pred.means, 
				z.range=c(0,120),
				theta=290,
				plot.type="persp", 
				device.type="default") 
}



### 4-way Interesting combos ###


tcb=c(1300,2900,3400)
nlcd=c(11,90,95)

for(i in 1:3){
	pred.means=list(TCB=tcb[i],NLCD=nlcd[i])

	model.interaction.plot(	model.obj,
				x="TCG", y="TCW",  
				main=paste("TCB =",tcb[i],"        NLCD =",nlcd[i]),
				pred.means=pred.means, 
				z.range=c(0,120),
				theta=290,
				plot.type="persp", 
				device.type="default") 
}

## End(Not run) #end dontRun

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(ModelMap)
Loading required package: randomForest
randomForest 4.6-12
Type rfNews() to see new features/changes/bug fixes.
Loading required package: raster
Loading required package: sp
Loading required package: rgdal
rgdal: version: 1.1-10, (SVN revision 622)
 Geospatial Data Abstraction Library extensions to R successfully loaded
 Loaded GDAL runtime: GDAL 1.11.3, released 2015/09/16
 Path to GDAL shared files: /usr/share/gdal/1.11
 Loaded PROJ.4 runtime: Rel. 4.9.2, 08 September 2015, [PJ_VERSION: 492]
 Path to PROJ.4 shared files: (autodetected)
 Linking to sp version: 1.2-3 
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/ModelMap/model.interaction.plot.Rd_%03d_medium.png", width=480, height=480)
> ### Name: model.interaction.plot
> ### Title: plot of two-way model interactions
> ### Aliases: model.interaction.plot
> ### Keywords: models
> 
> ### ** Examples
> 
> ###########################################################################
> ############################# Run this set up code: #######################
> ###########################################################################
> 
> # set seed:
> seed=38
> 
> # Define training and test files:
> 
> qdata.trainfn = system.file("extdata", "helpexamples","DATATRAIN.csv", package = "ModelMap")
> qdata.testfn = system.file("extdata", "helpexamples","DATATEST.csv", package = "ModelMap")
> 
> # Define folder for all output:
> folder=getwd()	
> 
> ########## Continuous Response, Categorical Predictors ############
> 
> 
> #file name to store model:
> MODELfn="RF_BIO_TCandNLCD"			
> 
> #predictors:
> predList=c("TCB","TCG","TCW","NLCD")
> 
> #define which predictors are categorical:
> predFactor=c("NLCD")
> 
> # Response name and type:
> response.name="BIO"
> response.type="continuous"
> 
> #identifier for individual training and test data points
> 
> unique.rowname="ID"
> 
> 
> ###########################################################################
> ########################### build model: ##################################
> ###########################################################################
> 
> 
> ### create model ###
> 
> model.obj = model.build( model.type="RF",
+                        qdata.trainfn=qdata.trainfn,
+                        folder=folder,		
+                        unique.rowname=unique.rowname,		
+                        MODELfn=MODELfn,
+                        predList=predList,
+                        predFactor=predFactor,
+                        response.name=response.name,
+                        response.type=response.type,
+                        seed=seed,
+                        na.action=na.roughfix
+ )
mtry = 1  OOB error = 1819.048 
Searching left ...
Searching right ...
mtry = 2 	OOB error = 1907.638 
-0.04870136 0.05 
> 
> ###########################################################################
> ###################### make interaction plots: ############################
> ###########################################################################
> 
> #########################
> ### Perspective Plots ###
> #########################
> 
> 
> ### specify first and third  predictors in 'predList (both continuous) ###
> 
> model.interaction.plot(	model.obj,
+ 			x=1,y=3, 
+ 			main=response.name, 
+ 			plot.type="persp", 
+ 			device.type="default") 
maximum value =  116.61 
Error in dev.new(width = device.width, height = device.height, pointsize = pointsize) : 
  no suitable unused file name for pdf()
Calls: model.interaction.plot -> initialize.device -> dev.new
Execution halted