R Graphical Manual

Last data update: 2014.03.03

read.fasta

R Documentation

Read FASTA formatted Sequences

Read aligned or unaligned sequences from a FASTA-format file.

read.fasta(file, rm.dup = TRUE, to.upper = FALSE, to.dash = TRUE)

`file`	input sequence file.
`rm.dup`	logical; if TRUE, sequences with duplicate names/ids are removed.
`to.upper`	logical; if TRUE, residues are forced to uppercase.
`to.dash`	logical; if TRUE, ‘.’ gap characters are converted to ‘-’ gap characters.

A list with two components:

`ali`	an alignment character matrix with one row per sequence and one column per equivalent amino acid/nucleotide position.
`ids`	sequence names, used as identifiers.
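
As a rough illustration of the arguments and return value described above, the sketch below writes a small made-up alignment to a temporary file and reads it back. It assumes that this page documents `read.fasta` from the bio3d package and that bio3d is installed; the sequences, ids and temporary file are hypothetical. The result is inspected with `str()` rather than by relying on particular component names.

```r
# Hypothetical three-record FASTA file; the last record repeats the id "seq1".
fasta_file <- tempfile(fileext = ".fasta")
writeLines(c(">seq1", "ACDE.GHIK",
             ">seq2", "acd-fghik",
             ">seq1", "ACDE.GHIK"),
           fasta_file)

# Only run the call when bio3d is available (assumed source of read.fasta).
if (requireNamespace("bio3d", quietly = TRUE)) {
  aln <- bio3d::read.fasta(fasta_file,
                           rm.dup   = TRUE,   # drop the repeated "seq1" record
                           to.upper = TRUE,   # force residues to uppercase
                           to.dash  = TRUE)   # convert '.' gaps to '-'
  str(aln)             # list structure of the result
  print(dim(aln$ali))  # rows = sequences, columns = alignment positions
}
```

Setting `to.upper = TRUE` here is only a choice for the example; the documented default is `FALSE`, which leaves the case of the residues unchanged.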

For a description of FASTA format see: http://www.ebi.ac.uk/help/formats_frame.html. When reading alignment files, the dash ‘-’ is interpreted as the gap character.
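
The gap conventions can be sketched in base R without any package. The small matrix below is a made-up stand-in for the `ali` component, and the replacement step only mimics what `to.dash = TRUE` is described as doing; it is not the function's actual implementation.

```r
# Hypothetical 2 x 4 alignment matrix standing in for the `ali` component.
ali <- rbind(seq1 = c("A", "C", ".", "E"),
             seq2 = c("A", "-", "D", "E"))

# Mimic to.dash = TRUE: turn '.' gap characters into '-'.
ali[ali == "."] <- "-"

# Logical matrix marking gap positions, then the number of gaps per column.
is_gap <- ali == "-"
gaps_per_column <- colSums(is_gap)
print(gaps_per_column)
```

After the conversion, a single comparison against ‘-’ is enough to locate every gap, which is the practical reason for normalizing the two gap characters to one.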