Last data update: 2014.03.03
R: Read FASTA formatted Sequences
read.fasta | R Documentation |
Read FASTA formatted Sequences
Description
Read aligned or unaligned sequences from a FASTA format file.
Usage
read.fasta(file, rm.dup = TRUE, to.upper = FALSE, to.dash = TRUE)
Arguments
file      input sequence file.
rm.dup    logical, if TRUE duplicate sequences (with the same names/ids) will be removed.
to.upper  logical, if TRUE residues are forced to uppercase.
to.dash   logical, if TRUE ‘.’ gap characters are converted to ‘-’ gap characters.
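The effect of the three logical arguments can be illustrated on a toy alignment in base R. This is only a sketch of the behaviour described above, not the code read.fasta itself uses; the `ids` and `ali` objects here are made-up example data.

```r
# Toy alignment: three records, two of which share the id "seqA".
ids <- c("seqA", "seqB", "seqA")
ali <- rbind(
  c("a", "c", ".", "g"),
  c("a", "-", ".", "g"),
  c("a", "c", ".", "g")
)

# rm.dup = TRUE: keep only the first record for each id.
keep <- !duplicated(ids)
ids <- ids[keep]
ali <- ali[keep, , drop = FALSE]

# to.upper = TRUE: force residues to uppercase (gap characters are unchanged).
ali[] <- toupper(ali)

# to.dash = TRUE: convert "." gap characters to "-".
ali[ali == "."] <- "-"

print(ids)
print(ali)
```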
Value
A list with two components:
ali  an alignment character matrix with a row per sequence and a column per equivalent amino acid/nucleotide.
ids  sequence names as identifiers.
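A minimal usage sketch follows. It assumes the function is provided by the bio3d package and that the package is installed; the file contents and the names `fasta_path` and `aln` are made up for illustration. Component names in the returned list may differ between package versions, so the example prints them rather than assuming them.

```r
# Write a small two-sequence alignment to a temporary file.
fasta_path <- tempfile(fileext = ".fasta")
writeLines(c(">seq1", "acde.ghk", ">seq2", "ac-efghk"), fasta_path)

# Assumption: read.fasta comes from the bio3d package; skipped if it is absent.
if (requireNamespace("bio3d", quietly = TRUE)) {
  aln <- bio3d::read.fasta(fasta_path, to.upper = TRUE)
  print(dim(aln$ali))  # rows = sequences, columns = alignment positions
  print(names(aln))    # lists the components actually returned
}
```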
Note
For a description of FASTA format see:
http://www.ebi.ac.uk/help/formats_frame.html.
When reading alignment files, the dash ‘-’ is interpreted as
the gap character.
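Because gaps are stored as ‘-’ entries in the character matrix, they can be located with ordinary matrix comparisons. The sketch below uses a hand-built matrix in place of the `ali` component, so it runs without any package; the sequences are made-up example data.

```r
# Stand-in for the ali component: one row per sequence, one column per position.
ali <- rbind(
  seq1 = c("A", "C", "D", "E", "-", "G"),
  seq2 = c("A", "C", "-", "E", "F", "G")
)

# Logical matrix marking gap positions.
is_gap <- ali == "-"

# Number of gaps in each sequence, and the columns that contain at least one gap.
gaps_per_seq <- rowSums(is_gap)
gap_columns <- which(colSums(is_gap) > 0)

print(gaps_per_seq)
print(gap_columns)
```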
Author(s)
Barry Grant
References
Grant, B.J. et al. (2006) Bioinformatics 22, 2695–2696.