Builds on the biglm package to automate the fitting of linear models to data that do not fit into R memory. Also includes a forward stepwise variable selection routine. None of the functions is limited by the size of the training or validation datasets. Requires the biglm package.
The function performs forward stepwise variable selection for linear models on datasets of any size, even those that do not fit into R memory. AIC, BIC, and MSE are the available selection criteria. Selection starts from the null model; at each step, the candidate variable that minimizes the chosen criterion is added, until the specified number of variables has entered the model.
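The selection loop described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: it uses in-memory `lm()` fits rather than biglm, AIC as the criterion, and a hypothetical helper name `forward_select` with simulated data.

```r
# Hypothetical helper: forward selection by AIC on an in-memory data frame.
# Illustrative only; the package itself fits models on chunked data.
forward_select <- function(data, response, candidates, max_vars) {
  selected <- character(0)  # start from the null model (no predictors)
  while (length(selected) < max_vars && length(candidates) > 0) {
    # Score each remaining candidate by the AIC of the model that adds it
    scores <- sapply(candidates, function(v) {
      f <- reformulate(c(selected, v), response = response)
      AIC(lm(f, data = data))
    })
    # Keep the candidate with the smallest AIC
    best <- names(scores)[which.min(scores)]
    selected <- c(selected, best)
    candidates <- setdiff(candidates, best)
  }
  selected
}

# Simulated data: y depends strongly on x1, weakly on x2, not on x3
set.seed(1)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
d$y <- 3 * d$x1 + 0.5 * d$x2 + rnorm(n)

chosen <- forward_select(d, "y", c("x1", "x2", "x3"), max_vars = 2)
print(chosen)
```

Swapping `AIC` for `BIC`, or for a mean squared error computed on a validation set, changes the criterion without changing the loop.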
readinbigdata
(Package: allan) :
Create a Connection to Very Large Datasets for Linear Model Fitting
This function opens a connection to a dataset. It is typically used when the dataset used for fitting is too large to reside in R memory. End users do not usually need to call it directly.
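To illustrate why a connection helps with data that cannot be loaded at once, here is a sketch of chunked reading in base R. It does not use `readinbigdata` or show its arguments; the temporary file, the chunk size of 4 rows, and the variable names are assumptions made for the example.

```r
# Write a small CSV to stand in for a dataset too large to load at once
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(x = 1:10, y = (1:10) * 2), tmp, row.names = FALSE)

# Open the connection once; each read continues where the previous one stopped
con <- file(tmp, open = "r")
header <- gsub('"', "", strsplit(readLines(con, n = 1), ",")[[1]])

total_rows <- 0
repeat {
  # read.csv raises an error once no lines remain, so treat that as end of data
  chunk <- tryCatch(
    read.csv(con, header = FALSE, nrows = 4, col.names = header),
    error = function(e) NULL
  )
  if (is.null(chunk) || nrow(chunk) == 0) break
  total_rows <- total_rows + nrow(chunk)  # a model update would go here
}
close(con)
print(total_rows)
```

Only one chunk is held in memory at a time, which is the property that incremental fitting of a linear model relies on.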
This function reads in a small portion of the data, measures how much memory that portion occupies in R, and then calculates the best size for each chunk from the available memory and the additional overhead needed for calculations.
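The calculation can be sketched as follows. This is an assumption-laden illustration rather than the package's formula: the helper name `estimate_chunk_rows`, the `overhead_factor` of 4, and the 100 MB memory budget are all hypothetical.

```r
# Hypothetical helper: estimate how many rows fit in one chunk.
# sample_df       : a small sample of the data already read into R
# available_bytes : memory budget for a chunk (assumed to be known)
# overhead_factor : assumed multiplier for working copies made during fitting
estimate_chunk_rows <- function(sample_df, available_bytes, overhead_factor = 4) {
  # Memory used per row, measured on the sample
  bytes_per_row <- as.numeric(object.size(sample_df)) / nrow(sample_df)
  # Rows that fit once the overhead is accounted for, at least one row
  max(1, floor(available_bytes / (bytes_per_row * overhead_factor)))
}

sample_df <- data.frame(a = rnorm(1000), b = rnorm(1000))
rows <- estimate_chunk_rows(sample_df, available_bytes = 100 * 1024^2)
print(rows)
```

A larger overhead factor gives smaller, safer chunks at the cost of more passes through the read loop.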