Last data update: 2014.03.03

R: Molecular Mutual Information
MolecularMI    R Documentation

Molecular Mutual Information

Description

Mutual information (MI) measures the interdependence of two discrete random variables: it quantifies the reduction in uncertainty about one variable given knowledge of the other. Placing entropy values on the diagonal of an MI matrix forms a structure comparable to a covariance matrix, which is appropriate for variability decomposition. MI identifies pairs of statistically dependent (coupled) sites, where MI=1 indicates complete coupling.
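For reference, the standard information-theoretic definition behind this description can be written as follows. The symbols X and Y (two alignment sites treated as discrete random variables) and p (their observed frequencies) are notation assumed here, not taken from the package documentation.

```latex
I(X;Y) \;=\; \sum_{x,y} p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)}
       \;=\; H(X) + H(Y) - H(X,Y)
       \;=\; H(X) - H(X \mid Y)
```

The last form is the "reduction in uncertainty" reading: the entropy of X minus the entropy that remains once Y is known.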

Usage

MolecularMI(x, type, normalized)

Arguments

x

matrix, vector, or list of aligned DNA or amino acid sequences. If a matrix, rows must be sequences and columns must be the individual characters (sites) of the alignment. Vector and list structures are coerced into this format.

type

one of "DNA", "AA", or "GroupAA": the method used to calculate and normalize the entropy value for each column (site).

normalized

method of normalization. If "NULL" or not provided, MI[i,j] = H(x[i]) + H(x[j]) - H(x[i],x[j]) for i,j = 1..n, where n is the number of sites. Otherwise, MI is normalized by a leveling constant. See NMI.
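The unnormalized formula above can be sketched in base R for a single pair of sites. This is an illustration only, not the package's implementation: the helper names `entropy` and `mutual_info` are hypothetical, and the log base of 2 is an assumption (MolecularMI's own scaling depends on `type`).

```r
# Shannon entropy of a discrete vector (log base 2 is an assumption here).
entropy <- function(v) {
  p <- as.numeric(table(v)) / length(v)
  -sum(p * log2(p))
}

# MI[i,j] = H(x[i]) + H(x[j]) - H(x[i], x[j]) for two alignment columns.
mutual_info <- function(a, b) {
  joint <- paste(a, b, sep = "|")   # one symbol per observed pair
  entropy(a) + entropy(b) - entropy(joint)
}

# Hypothetical toy columns, four sequences each.
site1 <- c("A", "A", "T", "T")
site2 <- c("C", "C", "G", "G")  # fully determined by site1
site3 <- c("C", "G", "C", "G")  # independent of site1

mi_coupled     <- mutual_info(site1, site2)
mi_independent <- mutual_info(site1, site3)
```

In this toy case the coupled pair gives MI equal to the entropy of either site, while the independent pair gives zero.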

Value

n x n matrix of mutual information values (DNA, AA, GroupAA), where n is the number of sites in the alignment. The diagonal contains the entropy value for each site.
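To make the shape of the return value concrete, the sketch below builds a comparable n x n matrix from a toy alignment in base R. The sequences, the `entropy` helper, and the base-2 logarithm are all assumptions for illustration; MolecularMI's own coercion and scaling may differ.

```r
# Hypothetical aligned sequences given as a character vector.
seqs <- c("ACC", "ACG", "TGC", "TGG")

# Coerce to a matrix: rows are sequences, columns are sites.
aln <- do.call(rbind, strsplit(seqs, split = ""))

# Shannon entropy in bits (base 2 is an assumption for this sketch).
entropy <- function(v) {
  p <- as.numeric(table(v)) / length(v)
  -sum(p * log2(p))
}

n  <- ncol(aln)
mi <- matrix(0, nrow = n, ncol = n)
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    joint <- paste(aln[, i], aln[, j], sep = "|")
    mi[i, j] <- entropy(aln[, i]) + entropy(aln[, j]) - entropy(joint)
  }
}

site_entropy <- apply(aln, 2, entropy)
```

Because H(x[i], x[i]) = H(x[i]), the formula reduces to the site entropy when i = j, which is why `diag(mi)` matches `site_entropy` without any special handling.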

Author(s)

Lisa McFerrin

See Also

MolecularEntropy, NMI

Examples


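data(bHLH288)
# Aligned sequences are taken from column 2 of the data set
bHLH_Seq = bHLH288[,2]
# MI matrices computed with the "AA" and "GroupAA" methods
bHLH.MIAA = MolecularMI(bHLH_Seq, "AA")
bHLH.MIFG = MolecularMI(bHLH_Seq, "GroupAA")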

##Compare Entropy values
MolecularEntropy(bHLH_Seq, "AA")$H
diag(bHLH.MIAA)
diag(bHLH.MIFG)

plot(diag(bHLH.MIFG), type = "h", ylab="Functional Entropy", xlab="site")

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(HDMD)
Loading required package: psych
Loading required package: MASS
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/HDMD/MolecularMI.Rd_%03d_medium.png", width=480, height=480)
> ### Name: MolecularMI
> ### Title: Molecular Mutual Information
> ### Aliases: MolecularMI
> 
> ### ** Examples
> 
> 
> data(bHLH288)
Warning message:
In data(bHLH288) : data set 'bHLH288' not found
> bHLH_Seq = bHLH288[,2]
> bHLH.MIAA = MolecularMI(bHLH_Seq, "AA")
> bHLH.MIFG = MolecularMI(bHLH_Seq, "GroupAA")
> 
> ##Compare Entropy values
> MolecularEntropy(bHLH_Seq, "AA")$H
[1] "Warning: Data set contains non-Amino Acid elements"
 [1] 0.54259103 0.44883025 0.89090476 0.84452658 0.73159258 0.58814523
 [7] 0.80581970 0.68842679 0.30343823 0.39552039 0.78921575 0.24862201
[13] 0.53058028 0.83901329 0.75605248 0.54445120 0.29267861 0.79649265
[19] 0.66141578 0.42816934 0.80558916 0.71555892 0.02703858 0.46609526
[25] 0.75256327 0.67780158 0.50486094 0.37247846 0.84418416 0.87541708
[31] 0.83007461 0.75547989 0.72573297 0.42963784 0.53763591 0.55703509
[37] 0.13229774 0.63502436 0.66488249 0.33354940 0.10677790 0.60162860
[43] 0.74506712 0.34903553 0.62899873 0.74030411 0.34102393 0.31918223
[49] 0.79573864 0.86153174 0.36257321
> diag(bHLH.MIAA)
 [1] 0.54259103 0.44883025 0.89090476 0.84452658 0.73159258 0.58814523
 [7] 0.80581970 0.68842679 0.30343823 0.39552039 0.78921575 0.24862201
[13] 0.53058028 0.83420580 0.75605248 0.54445120 0.29267861 0.79649265
[19] 0.66141578 0.42816934 0.80558916 0.71555892 0.02703858 0.46609526
[25] 0.75256327 0.67780158 0.50486094 0.37247846 0.84418416 0.87541708
[31] 0.83007461 0.75547989 0.72573297 0.42963784 0.53763591 0.55703509
[37] 0.13229774 0.63502436 0.66488249 0.33354940 0.10677790 0.60162860
[43] 0.74506712 0.34903553 0.62899873 0.74030411 0.34102393 0.31918223
[49] 0.79573864 0.86153174 0.36257321
> diag(bHLH.MIFG)
 [1] 0.45447502 0.34930341 0.79790124 0.77083695 0.66586067 0.71534908
 [7] 0.61812275 0.59869940 0.32889198 0.29586135 0.76226974 0.29421736
[13] 0.50861329 0.84541817 0.71961099 0.19956539 0.34624693 0.73310279
[19] 0.70752923 0.30724300 0.80360484 0.74399083 0.01112282 0.35003561
[25] 0.68121751 0.57195868 0.19879944 0.43772029 0.83787461 0.75657914
[31] 0.83258652 0.80500280 0.72930464 0.30902310 0.33107497 0.58966102
[37] 0.11433655 0.21223244 0.57275192 0.25262615 0.00000000 0.56579545
[43] 0.63427743 0.25473484 0.44804515 0.73195735 0.34735812 0.03103616
[49] 0.76331890 0.84368794 0.23303265
> 
> plot(diag(bHLH.MIFG), type = "h", ylab="Functional Entropy", xlab="site")
> 
> dev.off()
null device 
          1 
>