R: Classification Validity Assessment by Table Deviance
tabdev
R Documentation
Classification Validity Assessment by Table Deviance
Description
Table deviance is a method to assess the quality of classifications
by calculating the clarity of the classification with respect to the original
data, as opposed to a dissimilarity or distance matrix representation.
Usage
## Default S3 method:
tabdev(x,clustering,nitr=999,...)
## S3 method for class 'stride'
tabdev(x,taxa,...)
## S3 method for class 'tabdev'
summary(object,p=0.05,...)
Arguments
x
a matrix or data.frame of multivariate observations, with objects as rows,
and attributes as columns
clustering
a vector of integer cluster assignments, or an object of
class ‘clustering’ or ‘partana’
nitr
number of iterations to perform in calculating the probability of
obtaining as effective a classification as observed
taxa
a data.frame with samples as rows and species as columns
object
an object of class ‘tabdev’
p
the maximum probability threshold to list species in the summary table
...
ancillary arguments to maintain compatibility with generic summary function
Details
Tabdev calculates the concentration of values within clusters. For each
column, tabdev calculates the sum of values within each class, then divides each
within-class sum by the column total to get fractional sums by class. These values are
used to calculate the deviance of each attribute. Attributes that are widely dispersed
among classes exhibit high deviance; attributes that are concentrated within a single
class contribute zero deviance. An effective classification should exhibit low
deviance.
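The calculation described above can be sketched in a few lines of R. This is an illustration only, not the package's code: the helper name col_deviance is hypothetical, and the deviance formula used here, -2 * sum(s * log(f)) over classes (with s the within-class sum and f the fractional sum), is an assumption chosen because it gives zero for a perfectly concentrated attribute and larger values for dispersed ones. The exact formula used by tabdev is not stated in this documentation.

```r
# Hypothetical helper, not part of the package: per-attribute deviance
# under an ASSUMED entropy-like form, D_j = -2 * sum_k s_kj * log(f_kj).
col_deviance <- function(x, clustering) {
  x <- as.matrix(x)
  # s[k, j]: sum of column j over the rows assigned to class k
  s <- rowsum(x, group = clustering)
  # f[k, j]: fraction of column j's total that falls in class k
  f <- sweep(s, 2, colSums(x), "/")
  # Treat 0 * log(0) as 0 so classes where the attribute is absent add nothing
  term <- ifelse(s > 0, s * log(f), 0)
  -2 * colSums(term)
}

# Toy data: one attribute confined to class 1, one spread evenly over classes
x <- data.frame(
  concentrated = c(4, 6, 0, 0, 0, 0),
  dispersed    = c(2, 2, 2, 2, 2, 2)
)
clustering <- c(1, 1, 2, 2, 3, 3)

d <- col_deviance(x, clustering)
print(d)
```

Under this assumed form the concentrated attribute has deviance zero and the evenly dispersed attribute has positive deviance, matching the behaviour described above.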
Tabdev then permutes the values within columns and calculates the probability of
observing a deviance as low as the observed value as
$$ (m+1)/(nitr + 1)$$
where $m$ is the number of permutations with a deviance as low as or lower than
observed, and nitr is the number of iterations.
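The permutation probability can be sketched as follows. This is a minimal illustration under stated assumptions rather than the package's implementation: col_deviance and perm_prob are hypothetical helpers, the deviance formula is the same assumed entropy-like form (not confirmed by this documentation), and each column is shuffled independently with sample. The probability is computed per attribute here, which is an assumption based on the per-species probability column in the returned value.

```r
# Hypothetical helper: per-attribute deviance under an ASSUMED form,
# D_j = -2 * sum_k s_kj * log(f_kj); not taken from the package source.
col_deviance <- function(x, clustering) {
  s <- rowsum(x, group = clustering)      # within-class sums per column
  f <- sweep(s, 2, colSums(x), "/")       # fractional sums by class
  -2 * colSums(ifelse(s > 0, s * log(f), 0))
}

# Hypothetical helper: permutation probability (m + 1) / (nitr + 1)
perm_prob <- function(x, clustering, nitr = 999) {
  x <- as.matrix(x)
  observed <- col_deviance(x, clustering)
  m <- numeric(ncol(x))
  for (i in seq_len(nitr)) {
    # Shuffle each column independently, breaking its link to the clusters
    permuted <- apply(x, 2, sample)
    # Count permutations whose deviance is as low as or lower than observed
    m <- m + (col_deviance(permuted, clustering) <= observed)
  }
  (m + 1) / (nitr + 1)
}

set.seed(42)
x <- cbind(a = c(5, 7, 6, 0, 0, 0), b = c(0, 0, 0, 3, 4, 5))
clustering <- c(1, 1, 1, 2, 2, 2)

p <- perm_prob(x, clustering, nitr = 199)
print(p)
```

Adding one to both the numerator and the denominator counts the observed arrangement as one of the possible outcomes, so the smallest probability the sketch can return is 1/(nitr + 1) rather than zero.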
Value
a list with components:
spcdev
a data.frame with species, deviance, and probability as columns