Input matrix of dimension n * p; each of the n rows is an observation vector of p variables. The intercept should be included in the first column as (1,...,1). If it is missing, a column of ones is added.
Y
Response variable of length n.
maxordre
Number of variables to be ordered. Default is min(n/2-1,p/2-1).
ordre
Algorithm used to order the variables, one of ordre=c("bolasso","pval","pval_hd","FR"). "bolasso" uses the dyadic algorithm with the Bolasso technique (see dyadiqueordre), "pval" uses the p-values obtained with a regression on the full set of variables (only when p<n), "pval_hd" uses marginal regression, and "FR" uses Forward Regression. Default is "bolasso".
var_nonselect
Number of variables that do not undergo feature selection. They have to be in the first columns of data. Default is 1, meaning that the selection is not performed on the intercept.
m
Number of bootstrap iterations of the Lasso. Only used if ordre is set to "bolasso". Default is m=100.
showordre
If showordre=TRUE, show the variables being ordered at each step of the algorithm.
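The handling of the intercept column and the default value of maxordre described above can be sketched in base R. This is only an illustration under stated assumptions: prepare_input is a hypothetical helper, not a function of the package, and the use of floor() for a non-integer default, as well as counting the intercept in p, are guesses that the documentation does not confirm.

```r
# Hypothetical helper, not part of the package: mimics the documented
# treatment of the intercept column and the default for maxordre.
prepare_input <- function(data, maxordre = NULL) {
  data <- as.matrix(data)
  # Prepend a column of ones when the first column is not (1,...,1).
  if (!all(data[, 1] == 1)) {
    data <- cbind(1, data)
  }
  n <- nrow(data)
  p <- ncol(data)
  # Documented default: min(n/2-1, p/2-1).
  # Assumption: a non-integer value is rounded down with floor().
  if (is.null(maxordre)) {
    maxordre <- floor(min(n / 2 - 1, p / 2 - 1))
  }
  list(data = data, maxordre = maxordre)
}

set.seed(1)
x <- matrix(rnorm(20 * 6), nrow = 20, ncol = 6)  # no intercept column
prep <- prepare_input(x)
dim(prep$data)    # 20 rows, 7 columns once the intercept is added
prep$maxordre     # floor(min(9, 2.5)), i.e. 2
```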
Details
Ranks the variables of data with respect to the response vector Y and rearranges the columns of the input matrix in that order.
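As a rough illustration of ranking and then rearranging, the sketch below orders the columns by the p-value of a one-variable regression, which loosely mirrors the idea behind the "pval_hd" option. It uses only base R and simulated data; the variable names (candidates, pvals, ordrebeta) are hypothetical, and the package's actual computation may differ.

```r
set.seed(42)
n <- 50
X <- cbind(1, matrix(rnorm(n * 4), nrow = n))   # intercept + 4 variables
Y <- 3 * X[, 4] + 0.5 * X[, 2] + rnorm(n, sd = 0.1)

var_nonselect <- 1                               # the intercept is not ranked
candidates <- (var_nonselect + 1):ncol(X)

# Marginal p-value of each candidate: regress Y on that column alone
# and read the p-value of the slope (row 2, column 4 of the table).
pvals <- sapply(candidates, function(j) {
  fit <- lm(Y ~ X[, j])
  summary(fit)$coefficients[2, 4]
})

# Protected columns first, then candidates by increasing p-value.
ordrebeta <- c(seq_len(var_nonselect), candidates[order(pvals)])

# Rearranged matrix, analogous to data_ord in the returned value.
data_ord <- X[, ordrebeta]
```

Here column 4 carries most of the signal, so it is expected to come right after the intercept in ordrebeta.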
Value
data
A list containing:
X - The scaled matrix used in the algorithm, the first column being (1,...,1).
Y - The input response vector.
means.X - Vector of means of the input data matrix.
sigma.X - Vector of variances of the input data matrix.
data_ord
Input data matrix with its columns rearranged according to ORDREBETA.
ORDRE
Gives the maxordre most important variables of the data matrix.
ORDREBETA
Gives the order on all the variables of the data matrix: either an arbitrary completion of ORDRE (for "bolasso" and "FR") or the true order (for "pval" and "pval_hd").
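The relationship between the returned components can be sketched as follows. This is a base R illustration on simulated data, not the package's code. Several points are assumptions: that scaling means centering by means.X and dividing by the square root of sigma.X, that ORDRE includes the intercept column, and that an "arbitrary completion" appends the remaining columns in their original index order. The ORDRE vector itself is made up for the example.

```r
set.seed(7)
raw <- matrix(rnorm(30 * 5, mean = 10, sd = 2), nrow = 30)

means.X <- colMeans(raw)          # vector of column means
sigma.X <- apply(raw, 2, var)     # vector of column variances, as documented

# Assumption: scaled matrix = (raw - mean) / sd, with a leading column of ones.
X <- cbind(1, scale(raw, center = means.X, scale = sqrt(sigma.X)))

# Hypothetical partial order: intercept (column 1) plus two ranked variables.
ORDRE <- c(1, 4, 2)

# Assumed completion: append the columns not yet ranked, in index order.
ORDREBETA <- c(ORDRE, setdiff(seq_len(ncol(X)), ORDRE))

# Input matrix rearranged by ORDREBETA.
data_ord <- X[, ORDREBETA]
```

With these inputs ORDREBETA is a permutation of all six columns whose first three entries are those of ORDRE.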
References
Multiple hypotheses testing for variable selection; F. Rohart 2011