R: Amino Acid Metric Solution (Atchley et al 2005)
AAMetric.Atchley
R Documentation
Amino Acid Metric Solution (Atchley et al 2005)
Description
Atchley et al. (2005) used SAS to perform factor analysis on a set of amino acid indices (AA54) and inferred
a five-factor latent variable structure relating amino acid characteristics. Based on the
relationship between factors and variable descriptions, the latent variables are defined as
Factor1 (PAH): Polarity, Accessibility, Hydrophobicity; Factor2 (PSS): Propensity for Secondary Structure;
Factor3 (MS): Molecular Size; Factor4 (CC): Codon Composition; Factor5 (EC): Electrostatic Charge.
AAMetric.Atchley contains the scores from that factor analysis, which convey the similarities and differences
among amino acids (rows) for each latent variable (columns).
Format
Rows are amino acids in alphabetical order, and the five columns are the factors:
Factor1 (PAH): Polarity, Accessibility, Hydrophobicity; Factor2 (PSS): Propensity for Secondary Structure;
Factor3 (MS): Molecular Size; Factor4 (CC): Codon Composition; Factor5 (EC): Electrostatic Charge.
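Given this layout (one row per amino acid, one column per factor), a peptide can be converted to a numeric matrix by row lookup. The following is a minimal sketch, not part of the package: `encode_sequence` and the `toy` matrix are hypothetical names, the toy values are arbitrary placeholders rather than Atchley scores, and the sketch assumes the row names are one-letter residue codes (check `rownames(AAMetric.Atchley)` before relying on that).

```r
# Hypothetical helper: map each residue of a sequence to its row of factor scores.
# Assumes `metric` is a numeric matrix whose row names are one-letter residue codes.
encode_sequence <- function(seq, metric) {
  residues <- strsplit(seq, "")[[1]]
  unknown <- setdiff(residues, rownames(metric))
  if (length(unknown) > 0) {
    stop("residues not in metric: ", paste(unknown, collapse = ", "))
  }
  metric[residues, , drop = FALSE]
}

# Placeholder matrix with the same column layout; values are arbitrary, NOT real scores.
toy <- matrix(seq_len(15), nrow = 3, byrow = TRUE,
              dimnames = list(c("A", "C", "D"),
                              c("pah", "pss", "ms", "cc", "ec")))

encoded <- encode_sequence("ACDA", toy)
print(dim(encoded))  # one row per residue, one column per factor
```

With the package loaded, the same call would presumably take `AAMetric.Atchley` in place of `toy`, provided the row-name assumption holds.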
Details
Fifty-four amino acid indices were selected from www.genome.jp/aaindex to quantify physicochemical attributes.
A factor analysis with five factors yielded interpretable latent variables that quantify
amino acid attributes. These are the scores from the published factor analysis, as calculated by SAS.
The proportion of common variation for each factor are 42.3
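For readers unfamiliar with how factor scores arise, the general idea can be sketched in base R with `factanal`. This is an illustration only: it uses synthetic data and maximum-likelihood estimation, whereas the published scores were computed in SAS on the 54 indices, so it does not reproduce the Atchley analysis. All variable names below are made up for the example.

```r
# Simulate 200 observations of 6 variables driven by 2 latent factors.
# All values are synthetic; they are not amino acid indices.
set.seed(1)
n <- 200
latent <- matrix(rnorm(n * 2), nrow = n)
loadings <- rbind(c(0.9, 0), c(0.8, 0), c(0.7, 0),
                  c(0, 0.9), c(0, 0.8), c(0, 0.7))
x <- latent %*% t(loadings) + matrix(rnorm(n * 6, sd = 0.5), nrow = n)
colnames(x) <- paste0("index", 1:6)

# Maximum-likelihood factor analysis with varimax rotation (factanal's default),
# returning regression-method factor scores for each observation.
fit <- factanal(x, factors = 2, scores = "regression")

print(fit$loadings)      # how strongly each variable loads on each factor
print(head(fit$scores))  # one row of factor scores per observation
```

The `fit$scores` matrix plays the same role here that AAMetric.Atchley plays for amino acids: observations in rows, latent variables in columns.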
Source
Atchley, W. R., Zhao, J., Fernandes, A. and Drueke, T. 2005. Solving the sequence "metric" problem. Proc. Natl. Acad. Sci. USA 102: 6395-6400.
References
Atchley, W. R. and Fernandes, A. 2005. Sequence signatures and the probabilistic identification of proteins in the Myc-Max-Mad network. Proc. Natl. Acad. Sci. USA 102: 6401-6406.
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
> library(HDMD)
Loading required package: psych
Loading required package: MASS
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/HDMD/AAMetric.Atchley.Rd_%03d_medium.png", width=480, height=480)
> ### Name: AAMetric.Atchley
> ### Title: Amino Acid Metric Solution (Atchley et al 2005)
> ### Aliases: AAMetric.Atchley
> ### Keywords: datasets
>
> ### ** Examples
>
> data(AAMetric.Atchley)
Warning message:
In data(AAMetric.Atchley) : data set 'AAMetric.Atchley' not found
> plot(AAMetric.Atchley[,1], AAMetric.Atchley[,2], pch = AminoAcids)
>
> cor(AAMetric, AAMetric.Atchley)
            pah           pss          ms          cc          ec
pah  0.99530421  0.0736871113 -0.12587574 -0.11889767  0.03117132
pss  0.05875309  0.9826834483 -0.08629297  0.06208317  0.04304136
ms  -0.31217404 -0.2978205891  0.37953125 -0.26553117  0.20126148
cc  -0.16603923  0.0340235286 -0.16418210  0.98779448 -0.12086778
ec  -0.04245232 -0.0001209467 -0.27474905  0.09063324 -0.66015554
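The cross-correlation output above is easiest to read along its diagonal, which pairs each column of `AAMetric` with the same-named column of `AAMetric.Atchley`. The sketch below rebuilds the matrix from the printed values (so it runs without the package) and extracts that diagonal; the variable names `r` and `matched` are chosen for illustration.

```r
# Cross-correlations copied from the printed output above
# (rows: AAMetric columns; columns: AAMetric.Atchley columns).
factors <- c("pah", "pss", "ms", "cc", "ec")
r <- matrix(c(
   0.99530421,  0.0736871113, -0.12587574, -0.11889767,  0.03117132,
   0.05875309,  0.9826834483, -0.08629297,  0.06208317,  0.04304136,
  -0.31217404, -0.2978205891,  0.37953125, -0.26553117,  0.20126148,
  -0.16603923,  0.0340235286, -0.16418210,  0.98779448, -0.12086778,
  -0.04245232, -0.0001209467, -0.27474905,  0.09063324, -0.66015554
), nrow = 5, byrow = TRUE, dimnames = list(factors, factors))

# The diagonal pairs each factor with its namesake in the other solution.
matched <- diag(r)
names(matched) <- factors
print(round(matched, 3))
```

In this run the pah, pss and cc columns agree closely between the two objects (correlations above 0.98), while ms correlates only weakly (about 0.38) and ec is negatively correlated (about -0.66).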
>
> dev.off()
null device
1
>