Last data update: 2014.03.03

R: Metric Solution for Amino Acid characters
FactorTransform    R Documentation

Metric Solution for Amino Acid characters

Description

Based on the work of Atchley et al. (2005), amino acids are transformed into five numeric metrics derived from factor analysis scores: Factor1 (PAH): Polarity, Accessibility, Hydrophobicity; Factor2 (PSS): Propensity for Secondary Structure; Factor3 (MS): Molecular Size; Factor4 (CC): Codon Composition; Factor5 (EC): Electrostatic Charge. These values are biologically meaningful and allow sequences to be analyzed with standard statistical techniques such as analysis of variance, regression, and discriminant analysis.
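
As a rough illustration of the idea, the sketch below maps residues to Factor 1 (PAH) scores with a named lookup vector in base R. It does not call HDMD. The three scores are copied from the printed output of FactorTransform("HDMD", ...) in the Results on this page (residues H, D and M), and the names pah_scores and to_metric are hypothetical, introduced only for this example.

```r
# Factor 1 (PAH) scores for three residues, copied from the example
# output on this page; this is NOT the full 20-residue table.
pah_scores <- c(H = 0.3361654, D = 1.0501506, M = -0.6631257)

# Hypothetical helper: split a sequence into single characters and
# look each one up by name. Residues without a score become NA.
to_metric <- function(sequence, scores) {
  residues <- strsplit(sequence, "")[[1]]
  unname(scores[residues])
}

hdmd_metric <- to_metric("HDMD", pah_scores)
print(hdmd_metric)
```

The result has one value per residue, so a four-letter sequence yields a numeric vector of length four.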

Usage

FactorTransform(Source, Search = AminoAcids, Replace = AAMetric.Atchley, Factor = 1, bycol = TRUE, SeqName = NULL, alignment = FALSE, fillblank = NA)

Arguments

Source

Vector, matrix, or list of amino acid sequences using the single-character abbreviation.

Search

Vector of symbols to search over. Default is the list of Amino Acids.

Replace

Vector or matrix of values to replace Search items. Rows of Replace correspond to elements of Search when bycol = TRUE.

Factor

If Replace is a matrix, Factor designates which vector of Replace is used.

bycol

Logical. Designates whether Replace is oriented so that columns correspond to replaceable elements.

SeqName

Vector of sequence names

alignment

If FALSE, the result is a list. If TRUE, the result is a matrix, and rows for sequences shorter than the longest one are padded with fillblank.

fillblank

If alignment is TRUE, trailing sites are filled with this value. Default is NA, but it can be numeric.
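
The interplay of Search, Replace, Factor and bycol can be pictured with a small base R sketch. It assumes the orientation stated under Replace (rows of the matrix line up with the Search symbols, so Factor picks a column) and handles the other orientation by transposing first. The function lookup_factor, its symbols_in_rows flag, the two-symbol alphabet and the scores in toy_replace are all made up for illustration; they are not HDMD code or Atchley values.

```r
# Toy alphabet and made-up scores: one row per symbol, one column per factor.
toy_search  <- c("A", "B")
toy_replace <- matrix(c(1, 2,     # factor 1 scores for A and B
                        10, 20),  # factor 2 scores for A and B
                      nrow = 2,
                      dimnames = list(toy_search, c("F1", "F2")))

# Hypothetical helper. symbols_in_rows = TRUE means the rows of `replace`
# line up with `search`; otherwise the matrix is transposed first.
lookup_factor <- function(sequence, search, replace, factor = 1,
                          symbols_in_rows = TRUE) {
  if (!symbols_in_rows) replace <- t(replace)
  residues <- strsplit(sequence, "")[[1]]
  # match() gives the row of each residue; unknown symbols yield NA.
  unname(replace[match(residues, search), factor])
}

by_row <- lookup_factor("ABBA", toy_search, toy_replace, factor = 2)
by_col <- lookup_factor("ABBA", toy_search, t(toy_replace), factor = 2,
                        symbols_in_rows = FALSE)
```

Both calls select the same factor; only the orientation of the replacement matrix differs, which is the situation a flag such as bycol has to resolve.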

Value

A list or matrix containing numeric representations of the sequences is returned. If alignment is FALSE, each sequence is an element of the list, holding a vector of values whose length equals the length of the original sequence. If alignment is TRUE, a matrix is returned with one row per sequence. If the sequence lengths are unequal, the trailing positions of shorter sequences are filled with the value of fillblank.
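
To make the two return shapes concrete, here is a minimal base R sketch that turns a list of unequal-length numeric vectors into a padded matrix. The helper pad_to_matrix and its fill argument are hypothetical stand-ins for what alignment = TRUE and fillblank are described as doing; the package's actual implementation may differ, and the input values are made up.

```r
# Made-up metric vectors of unequal length, as a named list.
metric_list <- list(Seq1 = c(0.3, 1.1, -0.7), Seq2 = c(1.5, 0.5))

# Hypothetical helper: pad every vector to the longest length, then
# stack the vectors as rows of a matrix.
pad_to_matrix <- function(x, fill = NA) {
  width <- max(lengths(x))
  # Append `fill` to each vector until it reaches the longest length.
  padded <- lapply(x, function(v) c(v, rep(fill, width - length(v))))
  # One row per sequence; the list names become the row names.
  do.call(rbind, padded)
}

metric_matrix <- pad_to_matrix(metric_list)
zero_filled   <- pad_to_matrix(metric_list, fill = 0)
print(metric_matrix)
```

Padding with NA keeps missing sites distinguishable from real scores, while a numeric fill such as 0 can be more convenient for methods that do not accept missing values.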

Author(s)

Lisa McFerrin

References

Atchley, W. R., Zhao, J., Fernandes, A. and Drueke, T. 2005. Solving the protein sequence metric problem. Proc. Natl. Acad. Sci. USA 102: 6395-6400.

See Also

lapply, replace

Examples


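# Transform a single sequence with the Atchley metrics (Factor 1 by default)
FactorTransform("HDMD", Replace= AAMetric.Atchley)

# Load the bHLH288 data and take the sequences from its second column
data(bHLH288)
bHLH_Seq = as.vector(bHLH288[,2])
# Factor 4 (CC) returned as a list; Factor 3 (MS) returned as a matrix
bHLH_ccList = FactorTransform(bHLH_Seq, Factor=4)
bHLH_ms     = FactorTransform(bHLH_Seq, Factor=3, alignment=TRUE)

# Show the first 8 sites of selected sequences
bHLH_ms[c(20:25, 137:147, 190:196, 220:229, 264:273),1:8]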

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(HDMD)
Loading required package: psych
Loading required package: MASS
> ### Name: FactorTransform
> ### Title: Metric Solution for Amino Acid characters
> ### Aliases: FactorTransform FactorTransform.default FactorTransform.vector
> 
> ### ** Examples
> 
> 
> FactorTransform("HDMD", Replace= AAMetric.Atchley)
$Seq1
[1]  0.3361654  1.0501506 -0.6631257  1.0501506

> 
> data(bHLH288)
Warning message:
In data(bHLH288) : data set 'bHLH288' not found
> bHLH_Seq = as.vector(bHLH288[,2])
> bHLH_ccList = FactorTransform(bHLH_Seq, Factor=4)
> bHLH_ms     = FactorTransform(bHLH_Seq, Factor=3, alignment=TRUE)
> 
> bHLH_ms[c(20:25, 137:147, 190:196, 220:229, 264:273),1:8]
             [,1]       [,2]       [,3]       [,4]       [,5]       [,6]
Seq20   1.5021086  1.5021086  0.5332237 -0.7330651 -0.7330651  2.2134612
Seq21   1.5021086  1.5021086  0.5332237 -0.7330651 -0.7330651  2.2134612
Seq22   1.5021086  1.5021086  1.5021086 -0.7330651 -0.7330651  2.2134612
Seq23   1.5021086  1.5021086 -1.5046185 -0.7330651 -0.7330651  1.2991286
Seq24   1.5021086  1.5021086 -1.5046185 -0.7330651 -0.7330651  1.2991286
Seq25   1.5021086  1.5021086 -1.5046185 -0.7330651 -0.7330651  1.2991286
Seq137  1.5021086  1.5021086  1.5021086  1.2991286 -1.6733690  1.2991286
Seq138  1.5021086  1.5021086  1.5021086  1.2991286 -1.6733690  1.2991286
Seq139  0.5332237  1.5021086 -0.7330651 -0.8620345 -1.6733690  1.2991286
Seq140  1.5021086  1.5021086 -0.7330651  1.2991286 -1.6733690  1.2991286
Seq141  0.5332237  1.5021086  1.2991286 -3.0048731 -1.6733690  1.2991286
Seq142  1.5021086  0.5332237  0.6719274  0.5332237 -1.6733690 -0.5440132
Seq143  1.2991286  1.5021086 -4.7596375 -4.7596375 -1.6733690  1.2991286
Seq144  1.2991286  1.5021086 -4.7596375 -4.7596375 -1.6733690  1.2991286
Seq145  3.0973596  1.5021086 -4.7596375  2.2134612 -1.6733690  1.2991286
Seq146  1.2991286  1.5021086 -4.7596375  2.2134612 -1.6733690  1.2991286
Seq147 -4.7596375  1.5021086 -4.7596375  2.2134612 -1.6733690  1.2991286
Seq190 -3.6559147  1.3301017 -0.5440132  2.2134612  0.5332237 -4.7596375
Seq191 -3.6559147  1.3301017 -0.5440132  2.1314349  0.5332237 -4.7596375
Seq192  1.5021086  0.5332237  1.4766610  0.5332237 -4.7596375  1.5021086
Seq193  1.5021086  0.5332237  1.4766610  0.5332237 -4.7596375  1.5021086
Seq194  1.5021086  0.5332237  1.4766610  0.5332237 -4.7596375  1.5021086
Seq195  1.5021086  0.5332237  1.4766610  0.5332237 -4.7596375  1.5021086
Seq196  1.5021086  0.5332237  1.4766610  0.5332237 -4.7596375  1.5021086
Seq220 -0.7330651  1.4766610 -0.7330651 -0.7330651 -0.7330651 -3.6559147
Seq221 -0.8620345  0.5332237  1.3301017  1.3301017  1.4766610  1.4766610
Seq222  0.5332237  1.3301017  1.3301017 -1.6283286 -1.5046185  1.4766610
Seq223 -1.6733690  0.5332237  1.3301017 -1.6283286  1.3301017 -0.5440132
Seq224  0.5332237  1.3301017  1.2991286 -1.6283286 -0.7330651 -0.7330651
Seq225 -1.6283286 -1.6283286 -4.7596375 -1.6283286  2.1314349 -1.5046185
Seq226 -0.8620345  0.5332237  2.2194787  1.4766610 -3.6559147  1.4766610
Seq227  1.2991286  0.5332237 -4.7596375 -1.6283286 -0.7330651 -1.5046185
Seq228  0.5332237  2.1314349 -1.6283286 -1.5046185 -1.5046185 -3.6559147
Seq229  0.5332237  2.1314349 -1.6283286 -1.5046185 -1.5046185 -3.6559147
Seq264  1.5021086  0.5332237  1.2991286 -1.5046185  0.5332237 -1.6283286
Seq265  1.5021086  0.5332237 -4.7596375 -1.5046185  0.5332237 -1.6283286
Seq266  1.5021086  0.5332237 -4.7596375 -1.5046185  0.5332237 -1.6283286
Seq267 -3.0048731  0.5332237  0.5332237  1.2991286  0.5332237 -1.6283286
Seq268  1.5021086  0.5332237  2.2134612  1.2991286  0.5332237 -1.6283286
Seq269  3.0973596  0.5332237 -1.5046185 -1.6283286 -1.6733690  1.5021086
Seq270  3.0973596  0.5332237 -1.5046185 -1.6283286 -1.6733690  1.5021086
Seq271  3.0973596  0.5332237 -1.5046185 -1.6283286 -1.6733690  1.5021086
Seq272  0.5332237  0.5332237 -0.5440132 -4.7596375  0.5332237 -1.6283286
Seq273 -0.8620345  0.6719274  2.1314349 -4.7596375  0.5332237 -1.6283286
             [,7]       [,8]
Seq20   2.2194787  1.5021086
Seq21   2.2194787  1.5021086
Seq22  -1.5046185  1.5021086
Seq23  -0.7330651  1.5021086
Seq24  -0.7330651  1.5021086
Seq25  -0.7330651  1.5021086
Seq137  2.1314349 -1.5046185
Seq138  1.5021086  2.2194787
Seq139 -0.5440132 -1.5046185
Seq140 -0.5440132 -1.5046185
Seq141 -3.6559147  2.2194787
Seq142 -3.0048731  2.2194787
Seq143  1.4766610 -1.5046185
Seq144  1.4766610 -1.5046185
Seq145  1.4766610 -1.5046185
Seq146  1.4766610 -1.5046185
Seq147  1.4766610  2.2194787
Seq190  1.2991286 -1.6283286
Seq191  1.2991286 -1.6283286
Seq192 -3.6559147 -0.7330651
Seq193 -3.6559147 -0.7330651
Seq194 -3.6559147 -0.7330651
Seq195 -3.6559147 -0.7330651
Seq196 -3.6559147 -0.7330651
Seq220  1.4766610 -1.6283286
Seq221 -1.6283286 -0.7330651
Seq222  1.4766610 -1.6283286
Seq223 -3.6559147  1.4766610
Seq224  1.4766610  1.4766610
Seq225  1.4766610 -3.6559147
Seq226 -3.6559147 -1.5046185
Seq227  1.4766610  1.4766610
Seq228  1.4766610 -3.0048731
Seq229  1.4766610 -3.0048731
Seq264 -1.5046185 -1.5046185
Seq265 -1.5046185 -1.5046185
Seq266 -1.5046185 -1.5046185
Seq267 -1.5046185  2.2194787
Seq268  2.1314349  2.2194787
Seq269 -1.5046185  2.1314349
Seq270 -1.5046185  2.1314349
Seq271 -1.5046185  2.1314349
Seq272 -1.5046185  2.2194787
Seq273 -1.5046185  2.2194787