Pair-wise overlaps can be calculated for two types of analyses. Firstly, the feature rankings of each
cross-validation iteration can be compared within a single classification. This explores the feature ranking
stability. Secondly, the overlap may be considered between different classification results. This approach
compares the feature ranking commonality between different methods. Two summaries of commonality can be
analysed. One is the average pair-wise overlap between each level of the comparison factor and all other
levels. The other is the pair-wise overlap of each non-reference level of the comparison factor against the
reference level.
The overlaps are converted to percentages and plotted as lineplots.
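The average pair-wise overlap described above can be sketched in a few lines of base R. This is only an illustration of the idea, not the package's implementation; the toy rankings and the helper names `overlapPercent` and `averagePairwiseOverlap` are hypothetical.

```r
# Toy rankings: each element holds one cross-validation iteration's features,
# ordered from best to worst. The feature names are made up for illustration.
rankings <- list(
  iteration1 = c("A", "B", "C", "D", "E", "F"),
  iteration2 = c("B", "A", "D", "C", "F", "E"),
  iteration3 = c("A", "C", "B", "F", "D", "E")
)

# Percentage of the top n features shared by two rankings.
overlapPercent <- function(first, second, n)
  100 * length(intersect(first[seq_len(n)], second[seq_len(n)])) / n

# Average overlap over all pairs of rankings, for each threshold.
averagePairwiseOverlap <- function(rankings, thresholds)
{
  pairs <- combn(length(rankings), 2) # One column per pair of rankings.
  sapply(thresholds, function(n)
    mean(apply(pairs, 2, function(pair)
      overlapPercent(rankings[[pair[1]]], rankings[[pair[2]]], n))))
}

thresholds <- c(2, 4, 6)
averages <- averagePairwiseOverlap(rankings, thresholds)
```

Each element of `averages` corresponds to one threshold, so plotting `averages` against `thresholds` gives one line of the kind of lineplot described here.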
A sequence of thresholds for the number of top-ranked features to use when calculating overlaps.
comparison
The aspect of the experimental design to compare. See Details section for a
detailed description.
referenceLevel
The level of the comparison factor to use as the reference to compare each
non-reference level to. If NULL, then each level has the
average pairwise overlap calculated to all other levels.
lineColourVariable
The name of the slot whose levels are plotted as
different line colours.
lineColours
A vector of colours for different levels of the line colouring parameter. If NULL,
a default palette is used.
lineWidth
A single number controlling the thickness of lines drawn.
pointTypeVariable
The name of the slot whose levels are plotted as
different point shapes on the lines.
pointSize
A single number specifying the diameter of points drawn.
legendLinesPointsSize
A single number specifying the size of the lines and points in the legend,
if a legend is drawn.
rowVariable
The name of the slot whose levels are plotted as separate rows of lineplots.
columnVariable
The name of the slot whose levels are plotted as separate columns of lineplots.
yMax
The maximum value of the percentage to plot.
fontSizes
A vector of length 6. The first number is the size of the title.
The second number is the size of the axes titles. The third number is
the size of the axes values. The fourth number is the size of the
legends' titles. The fifth number is the font size of the legend labels.
The sixth number is the font size of the titles of grouped plots, if any
are produced; that is, when rowVariable or
columnVariable is not NULL.
title
An overall title for the plot.
xLabelPositions
Locations at which to put labels on the x-axis.
yLabel
Label to be used for the y-axis of overlap percentages.
margin
The margin to have around the plot.
showLegend
If TRUE, a legend is plotted next to the plot. If FALSE, it is hidden.
plot
Logical. If TRUE, a plot is produced on the current graphics device.
parallelParams
An object of class MulticoreParam or SnowParam.
Details
Possible values for characteristics are "datasetName", "classificationName",
"selectionName", and "validation". If "None", then that graphical element is not used.
If comparison is "within", then the feature rankings are compared within a particular
analysis. The result will inform how stable the feature rankings are between different iterations
of cross-validation for a particular analysis. If comparison is "classificationName", then the feature
rankings are compared across different classification algorithm types, for each level of "datasetName",
"selectionName" and "validation". The result will inform how stable the feature rankings
are between different classification algorithms, for every cross-validation scheme, selection algorithm and
dataset. If comparison is "selectionName", then the feature rankings are compared across different
feature selection algorithms, for each level of "datasetName", "classificationName" and
"validation". The result will inform how stable the feature rankings are between different feature
selection algorithms, for every dataset, classification algorithm and cross-validation scheme.
If comparison is "validation", then the feature rankings are compared across different
cross-validation schemes, for each level of "classificationName", "selectionName" and
"datasetName". The result will inform how stable the feature rankings are between different
cross-validation schemes, for every selection algorithm, classification algorithm and every dataset.
If comparison is "datasetName", then the feature rankings are compared across different datasets,
for each level of "classificationName", "selectionName" and "validation".
The result will inform how stable the feature rankings are between different datasets, for every
classification algorithm, selection algorithm and cross-validation scheme. This could be used to assess whether different experimental
studies have a highly overlapping feature ranking pattern.
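When levels of a comparison factor are compared against a reference level, each non-reference level gets its own overlap curve. The following base R sketch illustrates this for one ranking per level. The method labels, feature names and the helper `overlapToReference` are hypothetical and are not part of the package.

```r
# Hypothetical rankings, one per level of the comparison factor
# (here, three made-up feature selection methods applied to one dataset).
rankingsByMethod <- list(
  "t-test" = c("A", "B", "C", "D", "E", "F"),
  "limma"  = c("A", "B", "D", "C", "E", "F"),
  "random" = c("F", "E", "A", "D", "C", "B")
)

# Percentage of the top n features shared by two rankings.
overlapPercent <- function(first, second, n)
  100 * length(intersect(first[seq_len(n)], second[seq_len(n)])) / n

# Overlap of every non-reference level against the reference level.
# Rows are thresholds and columns are non-reference levels.
overlapToReference <- function(rankings, referenceLevel, thresholds)
{
  others <- setdiff(names(rankings), referenceLevel)
  result <- matrix(NA_real_, nrow = length(thresholds), ncol = length(others),
                   dimnames = list(as.character(thresholds), others))
  for(level in others)
    for(n in thresholds)
      result[as.character(n), level] <-
        overlapPercent(rankings[[level]], rankings[[referenceLevel]], n)
  result
}

toReference <- overlapToReference(rankingsByMethod, "t-test", c(2, 4))
```

In a full analysis this calculation would be repeated for every combination of the remaining characteristics, such as each dataset and each cross-validation scheme.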
Calculating all pair-wise set overlaps for a large cross-validation result can be time-consuming.
This stage can be done on multiple CPUs by providing the relevant options to parallelParams.
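As a rough illustration of why parallel processing helps, the pair-wise overlaps are independent of each other and can be distributed over workers. The sketch below assumes the BiocParallel package is installed and uses its `SnowParam` and `bplapply` functions. The toy rankings and the helper `pairOverlap` are hypothetical, and the function documented here may organise its parallel work differently.

```r
library(BiocParallel)

# Toy rankings for three cross-validation iterations (made-up feature names).
rankings <- list(
  iteration1 = c("A", "B", "C", "D", "E", "F"),
  iteration2 = c("B", "A", "D", "C", "F", "E"),
  iteration3 = c("A", "C", "B", "F", "D", "E")
)
pairs <- combn(length(rankings), 2) # One column per pair of rankings.

# Overlap percentage of the top n features for the pair in a given column.
# All data are passed as arguments so that each worker receives them.
pairOverlap <- function(index, rankings, pairs, n)
{
  first <- rankings[[pairs[1, index]]][seq_len(n)]
  second <- rankings[[pairs[2, index]]][seq_len(n)]
  100 * length(intersect(first, second)) / n
}

# Two worker processes. MulticoreParam(workers = 2) is an alternative
# on operating systems that support forking.
params <- SnowParam(workers = 2)
overlaps <- unlist(bplapply(seq_len(ncol(pairs)), pairOverlap,
                            rankings = rankings, pairs = pairs, n = 2,
                            BPPARAM = params))
```

An object such as `params` is the kind of value that would be given to parallelParams.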
Value
An object of class ggplot and a plot on the current graphics device, if plot is TRUE.