a string of characters to indicate the name of the MSA file to be read.
aa.to.upper
a logical value indicating whether amino acids should be converted to upper case (TRUE) or not (FALSE). Default is TRUE.
gap.to.dash
a logical value indicating whether the dot (.) and tilde (~) gap symbols should be converted
to the dash (-) character (TRUE) or not (FALSE). Default is TRUE.
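The effect of these two arguments on a single sequence string can be sketched in base R as follows. This is an illustration only: normalize_seq is a hypothetical helper, not a bios2mds function, and the package's own implementation may differ.

```r
# Hypothetical helper (not part of bios2mds): mimics the documented
# behaviour of aa.to.upper and gap.to.dash on one sequence string.
normalize_seq <- function(x, aa.to.upper = TRUE, gap.to.dash = TRUE) {
  # convert residues to upper case when requested
  if (aa.to.upper) x <- toupper(x)
  # map the dot and tilde gap symbols to the dash character when requested
  if (gap.to.dash) x <- chartr(".~", "--", x)
  x
}

normalize_seq("mk.lv~a-")
```

With both arguments set to FALSE, the string is returned unchanged.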
Details
FASTA (for FAST-ALL) was originally the input format of the FASTA program, which is used for protein comparison and database searching.
The FASTA format is now a standard format for biological sequences.
A FASTA-formatted file for a single sequence contains:
a single-line description beginning with a greater-than (>) symbol; the word that follows the symbol is the identifier.
any number of subsequent lines holding the biological sequence.
For multiple alignments, the FASTA-formatted sequences are concatenated to form a multiple
FASTA file.
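To make the layout concrete, the sketch below writes a small two-sequence multiple FASTA file and parses it with base R. It is a minimal illustration under stated assumptions, not the code used by import.fasta: parse_fasta_sketch, the sequence names and the sequences themselves are invented for the example, and storing one character per vector element is an assumption about the returned structure.

```r
# A small multiple FASTA alignment written to a temporary file.
# Identifiers and sequences are made up for illustration.
fasta_lines <- c(
  ">seq1 first example sequence",
  "MKT-LV",
  "AA.GH",
  ">seq2 second example sequence",
  "MKTALV",
  "AA-GH"
)
tmp <- tempfile(fileext = ".fa")
writeLines(fasta_lines, tmp)

# Hypothetical parser (not a bios2mds function).
parse_fasta_sketch <- function(file) {
  lines <- readLines(file)
  lines <- lines[nzchar(lines)]                 # drop empty lines
  is_header <- startsWith(lines, ">")
  # the identifier is the first word after the ">" symbol
  ids <- sub("^>[[:space:]]*([^[:space:]]+).*$", "\\1", lines[is_header])
  # each sequence line belongs to the most recent description line
  group <- factor(cumsum(is_header)[!is_header], levels = seq_along(ids))
  seq_lines <- split(lines[!is_header], group)
  seqs <- vapply(seq_lines, paste, character(1), collapse = "")
  # assumption: one character per vector element
  out <- strsplit(unname(seqs), "")
  names(out) <- ids
  out
}

aln_sketch <- parse_fasta_sketch(tmp)
names(aln_sketch)
```

Sequence lines under the same description line are joined before splitting, so a sequence wrapped over several lines is read as one sequence.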
Value
An object of class 'align', which is a named list whose elements correspond to sequences, in the form of character vectors.
References
Pearson WR and Lipman DJ (1988) Improved tools for biological sequence comparison.
Proc Natl Acad Sci U S A 85:2444-2448.
See Also
read.fasta function from bio3d package. read.fasta function from seqinr package. read.FASTA function from aaMI package (archived).
Examples
# reading the multiple sequence alignment of human GPCRs in FASTA format:
aln <- import.fasta(system.file("msa/human_gpcr.fa", package = "bios2mds"))