entropy.shrink
R Documentation
Shrinkage Estimators of Entropy, Mutual Information and Related Quantities
Description
freqs.shrink estimates the bin frequencies from the counts y
using a James-Stein-type shrinkage estimator, where the shrinkage target is the uniform distribution.
entropy.shrink estimates the Shannon entropy H of the random variable Y
from the corresponding observed counts y by plugging in the shrinkage estimate
of the bin frequencies.
KL.shrink computes a shrinkage estimate of the Kullback-Leibler (KL) divergence
from counts y1 and y2.
chi2.shrink computes a shrinkage version of the chi-squared statistic
from counts y1 and y2.
mi.shrink computes a shrinkage estimate of the mutual information of two random variables.
chi2indep.shrink computes a shrinkage version of the chi-squared statistic of independence
from a table of counts y2d.
Arguments
unit
the unit in which entropy is measured.
The default is "nats" (natural units). For
computing entropy in "bits" set unit="log2".
lambda.freqs
shrinkage intensity. If not specified (the default), it is estimated in a James-Stein-type fashion.
lambda.freqs1
shrinkage intensity for the first random variable. If not specified (the default), it is estimated in a James-Stein-type fashion.
lambda.freqs2
shrinkage intensity for the second random variable. If not specified (the default), it is estimated in a James-Stein-type fashion.
verbose
report the estimated shrinkage intensity.
Details
The shrinkage estimator is a James-Stein-type estimator. It is essentially
an entropy.Dirichlet estimator, where the pseudocount is
estimated from the data.
For details see Hausser and Strimmer (2009).
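Concretely, the Hausser-Strimmer estimator shrinks the maximum-likelihood frequencies y_k/n toward the uniform target 1/p with an analytically estimated intensity lambda. The following NumPy sketch illustrates the published formula only; the function names freqs_shrink and entropy_shrink are made-up stand-ins, not the package's own code:

```python
import numpy as np

def freqs_shrink(y):
    # James-Stein-type shrinkage of bin frequencies toward the uniform
    # distribution, following Hausser and Strimmer (2009).
    y = np.asarray(y, dtype=float)
    n, p = y.sum(), len(y)
    theta = y / n                      # maximum-likelihood frequencies
    t = np.full(p, 1.0 / p)           # shrinkage target: uniform
    denom = (n - 1.0) * np.sum((t - theta) ** 2)
    lam = 1.0 if denom == 0 else (1.0 - np.sum(theta ** 2)) / denom
    lam = min(1.0, max(0.0, lam))     # clip intensity to [0, 1]
    return lam * t + (1.0 - lam) * theta, lam

def entropy_shrink(y):
    # Plug-in Shannon entropy (in nats) of the shrinkage frequencies.
    freqs, lam = freqs_shrink(y)
    nz = freqs[freqs > 0]
    return -np.sum(nz * np.log(nz)), lam

y = [4, 2, 3, 0, 2, 4, 0, 0, 2, 1, 1]  # counts from the Examples section
H, lam = entropy_shrink(y)             # H ~ 2.3796 nats, lam ~ 0.7664
```

With so few counts per bin, the estimated intensity is large (about 0.77), pulling the frequencies well toward uniform before the entropy plug-in.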
Value
freqs.shrink returns a shrinkage estimate of the frequencies.
entropy.shrink returns a shrinkage estimate of the Shannon entropy.
KL.shrink returns a shrinkage estimate of the KL divergence.
chi2.shrink returns a shrinkage version of the chi-squared statistic.
mi.shrink returns a shrinkage estimate of the mutual information.
chi2indep.shrink returns a shrinkage version of the chi-squared statistic of independence.
In all instances the estimated shrinkage intensity is attached to the returned
value as attribute lambda.freqs.
References
Hausser, J., and K. Strimmer. 2009. Entropy inference and the James-Stein
estimator, with application to nonlinear gene association networks.
J. Mach. Learn. Res. 10: 1469-1484. Available online from
http://jmlr.csail.mit.edu/papers/v10/hausser09a.html.
Examples
# load entropy library
library("entropy")
# a single variable
# observed counts for each bin
y = c(4, 2, 3, 0, 2, 4, 0, 0, 2, 1, 1)
# shrinkage estimate of frequencies
freqs.shrink(y)
# shrinkage estimate of entropy
entropy.shrink(y)
# example with two variables
# observed counts for two random variables
y1 = c(4, 2, 3, 1, 10, 4)
y2 = c(2, 3, 7, 1, 4, 3)
# shrinkage estimate of Kullback-Leibler divergence
KL.shrink(y1, y2)
# half of the shrinkage chi-squared statistic
0.5*chi2.shrink(y1, y2)
## joint distribution example
# contingency table with counts for two discrete variables
y2d = rbind( c(1,2,3), c(6,5,4) )
# shrinkage estimate of mutual information
mi.shrink(y2d)
# half of the shrinkage chi-squared statistic of independence
0.5*chi2indep.shrink(y2d)
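The mutual information example can be traced by hand: shrink the joint cell frequencies exactly as for a single variable, then plug them into MI = sum_ij p_ij log(p_ij / (p_i. p_.j)). A minimal NumPy sketch (an illustrative stand-in for the computation, not the package function):

```python
import numpy as np

def mi_shrink(y2d):
    # Shrink the joint cell frequencies toward the uniform distribution
    # (Hausser and Strimmer, 2009), then plug into mutual information.
    y = np.asarray(y2d, dtype=float)
    n, p = y.sum(), y.size
    theta = y / n                     # maximum-likelihood cell frequencies
    t = 1.0 / p                       # uniform shrinkage target
    denom = (n - 1.0) * np.sum((t - theta) ** 2)
    lam = 1.0 if denom == 0 else (1.0 - np.sum(theta ** 2)) / denom
    lam = min(1.0, max(0.0, lam))     # clip intensity to [0, 1]
    freqs = lam * t + (1.0 - lam) * theta
    # MI = sum_ij p_ij * log( p_ij / (p_i. * p_.j) )
    rows = freqs.sum(axis=1, keepdims=True)
    cols = freqs.sum(axis=0, keepdims=True)
    nz = freqs > 0
    return np.sum(freqs[nz] * np.log(freqs[nz] / (rows * cols)[nz])), lam

y2d = np.array([[1, 2, 3], [6, 5, 4]])  # table from the example above
mi, lam = mi_shrink(y2d)
```

For this tiny 2x3 table the estimated intensity of this sketch comes out at (essentially) 1, so the joint distribution is shrunk all the way to uniform and the MI estimate is ~0: with so few counts, the uniform target dominates, which is the expected small-sample behavior of heavy shrinkage.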