Tableplots of a large dataset can be generated quickly when the preprocessing stage is done only once. This function preprocesses the dataset and returns an object that can be passed to tableplot. After this step, tableplots are generated quickly, regardless of which column the data is sorted on or how many row bins are chosen.
Usage
tablePrepare(x, name = NULL, dir = NULL, ...)
Arguments
x
data.frame or ffdf, will be transformed into an ffdf object.
name
name of the dataset
dir
directory to store the prepared object. If unspecified, the prepared object will not be saved, and the underlying data will be stored temporarily in options("fftempdir").
...
arguments passed to other methods (at the moment only overwrite from savePrepare)
Details
The function bin_data needs a prepared data.frame.
tablePrepare transforms the supplied data into an ffdf object and calculates
the order of each of its columns. Knowing the order of the columns speeds up
the binning process considerably. For large ffdf objects this may be a time-consuming
step, so it can be wise to call tablePrepare before making a tableplot.
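The idea of computing each column's order once and reusing it can be sketched in base R. This is only an illustration under simplifying assumptions, not tabplot's internal code: it uses an ordinary data.frame instead of an ffdf, sorts in increasing order, and the helper bin_means and the toy data df are hypothetical names introduced for the example.

```r
# Illustrative sketch in base R only; not tabplot's actual implementation.
set.seed(1)
df <- data.frame(a = runif(1000), b = rnorm(1000))

# "Prepare" step: compute the row order of every column once.
ord <- lapply(df, order)

# Hypothetical helper: per-bin column means, with rows sorted by `sort_col`.
bin_means <- function(df, ord, sort_col, n_bins) {
  idx <- ord[[sort_col]]  # reuse the stored order, no re-sorting needed
  # assign the sorted rows to n_bins equally sized row bins
  bin <- ceiling(seq_along(idx) * n_bins / length(idx))
  sapply(df, function(col) tapply(col[idx], bin, mean))
}

# Changing the sort column or the number of bins reuses the same `ord`.
m_a <- bin_means(df, ord, "a", 10)
m_b <- bin_means(df, ord, "b", 50)
```

Because the expensive sorting happens only in the lapply(df, order) call, each later call to bin_means only indexes and aggregates, which is why repeated plots with different settings stay cheap.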
Value
a prepared object, including the data and order of each of the columns
Examples
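# load tabplot and the diamonds dataset from ggplot2
require(tabplot)
require(ggplot2)
data(diamonds)
# preprocess once; the prepared object can be reused for every plot
p <- tablePrepare(diamonds)
# different sort columns and bin counts reuse the same prepared object
tableplot(p, nBins=200, sortCol=depth)
tableplot(p, nBins=50, sortCol=price)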