Test whether the tree selected by the BIC, AIC or CV procedure is significantly associated with the dependent variable, while adjusting for a confounding effect.
xtree
the maximal tree obtained by the function pltr.glm
xdata
the data frame used to build xtree
Y.name
the name of the dependent variable
X.names
the names of independent confounding variables to consider in the linear part of the glm
G.names
the names of independent variables to consider in the tree part of the hybrid glm.
B
the number of resamples used to approximate the distribution of the deviance difference
args.rpart
a list of options that control details of the rpart algorithm. minbucket: the minimum number of observations in any terminal <leaf> node; cp: complexity parameter (Any split that does not decrease the overall lack of fit by a factor of cp is not attempted); maxdepth: the maximum depth of any node of the final tree, with the root node counted as depth 0. ...
See rpart.control for further details
epsi
a threshold value used to check the convergence of the algorithm
iterMax
the maximum number of iterations to consider
iterMin
the minimum number of iterations to consider
family
the glm family considered depending on the type of the dependent variable.
LB
a logical value (TRUE or FALSE) indicating whether the load is balanced across workers in the parallel computation
args.parallel
parameters of the parallelization. See mclapply for more details.
index
the size of the tree selected by the function best.tree.BIC.AIC or best.tree.CV under one of the proposed criteria
verbose
Logical; TRUE for printing progress during the computation (helpful for debugging)
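As a rough illustration of what LB and args.parallel control, the sketch below uses only the base parallel package. It assumes that numWorkers and type are passed on to something like makeCluster, and that LB = TRUE corresponds to load-balanced dispatch. That mapping is an assumption made for illustration, not a description of the package internals, and run_resamples is a hypothetical helper that is not part of the package.

```r
library(parallel)

# Hypothetical helper (not part of the package): run `fun` over 1..B on a
# cluster described by a list shaped like args.parallel (numWorkers, type).
run_resamples <- function(B, fun, args.parallel, LB = FALSE) {
  cl <- makeCluster(args.parallel$numWorkers, type = args.parallel$type)
  on.exit(stopCluster(cl))
  if (LB) {
    # Load-balanced: a worker picks up a new task as soon as it is free
    parLapplyLB(cl, seq_len(B), fun)
  } else {
    # Static split: tasks are divided among the workers up front
    parLapply(cl, seq_len(B), fun)
  }
}

args.parallel <- list(numWorkers = 2, type = "PSOCK")
res_static   <- run_resamples(8, function(b) b^2, args.parallel, LB = FALSE)
res_balanced <- run_resamples(8, function(b) b^2, args.parallel, LB = TRUE)
```

Load balancing tends to help when individual resamples take very different amounts of time; with tasks of similar cost, the static split usually has less overhead.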
Value
A list of three elements:
p.value
The P-value of the selected tree
Timediff
The execution time of the test procedure
Badj
The number of samples used inside the procedure
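To make the returned p.value more concrete, here is a minimal base-R sketch of a resampling p-value for a deviance difference. It assumes a parametric bootstrap under the null model (confounder only) and uses a fixed binary indicator g in place of a fitted tree, so it illustrates the general idea rather than the procedure implemented in p.val.tree. All names (x, g, dev_diff, stat_boot) are illustrative.

```r
set.seed(1)
n <- 200
x <- rnorm(n)                              # confounder (linear part)
g <- rbinom(n, 1, 0.5)                     # stand-in for a tree-leaf indicator
y <- rbinom(n, 1, plogis(-0.5 + 0.8 * x))  # simulated with no effect of g

# Deviance difference between the null model and the model with g added
dev_diff <- function(y, x, g) {
  fit0 <- glm(y ~ x, family = binomial)
  fit1 <- glm(y ~ x + g, family = binomial)
  deviance(fit0) - deviance(fit1)
}

stat_obs <- dev_diff(y, x, g)

# Draw B responses from the fitted null model and recompute the statistic
B <- 100
p0 <- fitted(glm(y ~ x, family = binomial))
stat_boot <- replicate(B, dev_diff(rbinom(n, 1, p0), x, g))

# Proportion of resampled statistics at least as large as the observed one
p_value <- mean(stat_boot >= stat_obs)
```

This sketch skips the step of rebuilding and pruning the tree on each resample, which a tree-based test would need in order to account for the selection of the tree.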
Author(s)
Cyprien Mbogning
References
Mbogning, C., Perdry, H., Toussile, W., Broet, P.: A novel tree-based procedure for deciphering the genomic spectrum of clinical disease entities. Journal of Clinical Bioinformatics 4:6, (2014)
Fan, J., Zhang, C., Zhang, J.: Generalized likelihood ratio statistics and Wilks phenomenon. Annals of Statistics 29(1), 153-193 (2001)
See Also
best.tree.bootstrap, best.tree.permute
Examples
## Not run:
## load the data set
data(data_pltr)
## set the parameters
args.rpart <- list(minbucket = 40, maxdepth = 10, cp = 0)
family <- "binomial"
Y.name <- "Y"
X.names <- "G1"
G.names <- paste("G", 2:15, sep="")
## build a maximal tree
fit_pltr <- pltr.glm(data_pltr, Y.name, X.names, G.names, args.rpart = args.rpart,
                     family = family, iterMax = 5, iterMin = 3)
## prune back the maximal tree using the BIC or AIC criterion
tree_select <- best.tree.BIC.AIC(xtree = fit_pltr$tree, data_pltr, Y.name,
                                 X.names, family = family)
## Compute the p-value of the tree selected by BIC
args.parallel <- list(numWorkers = 10, type = "PSOCK")
index <- tree_select$best_index[[1]]
p_value <- p.val.tree(xtree = fit_pltr$tree, data_pltr, Y.name, X.names, G.names,
                      B = 100, args.rpart = args.rpart, epsi = 1e-3,
                      iterMax = 5, iterMin = 3, family = family, LB = FALSE,
                      args.parallel = args.parallel, index = index)
## End(Not run)