Plot the marginal effect of an x-variable on the class probability
(classification), response (regression), mortality (survival), or
the expected years lost (competing risk) from an RF-SRC analysis. Users
can select between marginal (unadjusted, but fast) and partial plots
(adjusted, but slow).
x
An object of class (rfsrc, grow), (rfsrc, synthetic),
(rfsrc, predict), or (rfsrc, plot.variable). See the
examples below for illustration of the latter.
xvar.names
Names of the x-variables to be used.
which.class
For classification families, an integer or
character value specifying the class to focus on (defaults to the
first class). For competing risk families, an integer value between
1 and J indicating the event of interest, where J is
the number of event types. The default is to use the first event
type.
outcome.target
Character value for multivariate families
specifying the target outcome to be used. The default is to use the
first coordinate.
time
For survival families, the time at which the predicted
survival value is evaluated (depends on surv.type).
surv.type
For survival families, specifies the predicted value.
See details below.
partial
Should partial plots be used?
show.plots
Should plots be displayed?
plots.per.page
Integer value controlling page layout.
granule
Integer value controlling whether a variable is treated as a
factor and therefore displayed as a boxplot. Larger values coerce
more variables to boxplots.
sorted
Should variables be sorted by importance values?
nvar
Number of variables to be plotted. Default is all.
npts
Maximum number of points used when generating partial
plots for continuous variables.
smooth.lines
Use lowess to smooth partial plots.
subset
Vector indicating which rows of the x-variable matrix
x$xvar to use. All rows are used if not specified.
...
Further arguments passed to or from other methods.
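The arguments above can be combined as in the following minimal sketch. It assumes the randomForestSRC package is installed and uses the built-in airquality data purely for illustration; the object names airq.obj and airq.pv are arbitrary. It also illustrates passing an object of class (rfsrc, plot.variable) back to the function to redraw saved results.

```r
library(randomForestSRC)

# Grow a regression forest; rows with missing values are dropped by
# rfsrc's default na.action.
airq.obj <- rfsrc(Ozone ~ ., data = airquality)

# Marginal plots for two x-variables: fast, but unadjusted.
plot.variable(airq.obj, xvar.names = c("Temp", "Wind"))

# Partial plots: compute once without drawing (show.plots = FALSE),
# keeping npts small to limit run time.
airq.pv <- plot.variable(airq.obj, xvar.names = c("Temp", "Wind"),
                         partial = TRUE, npts = 10, show.plots = FALSE)

# Redraw from the saved (rfsrc, plot.variable) object without recomputing.
plot.variable(airq.pv)
```

Saving the returned object is mainly useful for partial plots, where recomputing the values is the expensive step.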
Details
The vertical axis displays the ensemble predicted value, while
x-variables are plotted on the horizontal axis.
For regression, the predicted response is used.
For classification, it is the predicted class probability
specified by which.class.
For multivariate families, it is the predicted value of the
outcome specified by outcome.target and if that is a
classification outcome, by which.class.
For survival, the choices are:
Mortality (mort).
Relative frequency of mortality (rel.freq).
Predicted survival (surv), where the predicted
survival is for the time point specified using
time (the default is the median follow up time).
For competing risks, the choices are:
The expected number of life years lost (years.lost).
The cumulative incidence function (cif).
The cumulative hazard function (chf).
In all three cases, the predicted value is for the event type
specified by which.class. For cif and
chf the quantity is evaluated at the time point specified
by time.
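As an illustration of the survival choices, the sketch below assumes the randomForestSRC package and its bundled veteran data. The variable karno, the value time = 90 (in the data's own time units), and the small ntree are picked only to keep the example short.

```r
library(randomForestSRC)

data(veteran, package = "randomForestSRC")
v.obj <- rfsrc(Surv(time, status) ~ ., data = veteran, ntree = 100)

# Marginal plot of mortality against one x-variable.
plot.variable(v.obj, xvar.names = "karno", surv.type = "mort")

# Partial plot of predicted survival evaluated at a chosen time point.
v.pv <- plot.variable(v.obj, xvar.names = "karno", surv.type = "surv",
                      time = 90, partial = TRUE, npts = 10)
```

For a competing risk forest, the same pattern would apply with surv.type set to years.lost, cif, or chf and with which.class selecting the event type.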
For partial plots use partial=TRUE. Their interpretation differs
from that of marginal plots. The y-value for a variable X,
evaluated at X=x, is
\tilde{f}(x) = \frac{1}{n} \sum_{i=1}^n \hat{f}(x, x_{i,o}),
where n is the number of individuals averaged over, x_{i,o}
represents the values of all variables other than X for individual
i, and \hat{f} is the predicted value. Generating partial plots can
be very slow. Choosing a small value for npts can reduce computation
time, as this restricts the number of distinct x values used
in computing \tilde{f}.
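The averaging behind a partial value can be sketched in base R as follows. This is not the package's implementation: the helper partial_dependence is hypothetical, and a linear model on mtcars stands in for the forest so that the example runs without any add-on packages.

```r
# For each grid value x, set X = x in every row, predict, and average
# the predictions over the n rows.
partial_dependence <- function(predict_fun, data, xvar, grid) {
  sapply(grid, function(x) {
    data_x <- data
    data_x[[xvar]] <- x          # overwrite X, keep x_{i,o} as observed
    mean(predict_fun(data_x))    # (1/n) * sum_i f_hat(x, x_{i,o})
  })
}

# Stand-in model (an assumption for illustration; not a random forest).
fit <- lm(mpg ~ wt + hp, data = mtcars)
predict_fun <- function(d) predict(fit, newdata = d)

# A small grid plays the role of npts: fewer values, fewer predict calls.
grid <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 10)
pd <- partial_dependence(predict_fun, mtcars, "wt", grid)

plot(grid, pd, type = "b", xlab = "wt", ylab = "partial value")
```

Each grid value costs one full pass of predictions over the data, which is why the number of grid points drives the run time.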
For continuous variables, red points indicate partial values and
dashed red lines indicate a smoothed error bar of +/- two
standard errors. The black dashed line shows the partial values. Set
smooth.lines=TRUE for lowess smoothed lines. For discrete
variables, partial values are indicated using boxplots with whiskers
extending out approximately two standard errors from the mean.
Standard errors are meant only to be a guide and should be
interpreted with caution.
Partial plots can be slow. Setting npts to a smaller number
can help.
Author(s)
Hemant Ishwaran and Udaya B. Kogalur
References
Friedman J.H. (2001). Greedy function approximation: a gradient
boosting machine, Ann. of Statist., 29(5):1189-1232.
Ishwaran H., Kogalur U.B. (2007). Random survival forests for R,
Rnews, 7(2):25-31.
Ishwaran H., Kogalur U.B., Blackstone E.H. and Lauer M.S.
(2008). Random survival forests, Ann. Appl.
Statist., 2:841-860.
Ishwaran H., Gerds T.A., Kogalur U.B., Moore R.D., Gange S.J. and Lau
B.M. (2014). Random survival forests for competing risks,
Biostatistics, 15(4):757-773.