Use FaFile() to create a reference to an indexed fasta
file. The reference remains open across calls to methods, avoiding
costly index re-loading.
FaFileList() provides a convenient way of managing a list of
FaFile instances.
Usage
## Constructors
FaFile(file, index=sprintf("%s.fai", file), ...)
FaFileList(...)
## Opening / closing
## S3 method for class 'FaFile'
open(con, ...)
## S3 method for class 'FaFile'
close(con, ...)
## accessors; also path(), index()
## S4 method for signature 'FaFile'
isOpen(con, rw="")
## actions
## S4 method for signature 'FaFile'
indexFa(file, ...)
## S4 method for signature 'FaFile'
scanFaIndex(file, ...)
## S4 method for signature 'FaFileList'
scanFaIndex(file, ..., as=c("GRangesList", "GRanges"))
## S4 method for signature 'FaFile'
seqinfo(x)
## S4 method for signature 'FaFile'
countFa(file, ...)
## S4 method for signature 'FaFile,GRanges'
scanFa(file, param, ...,
       as=c("DNAStringSet", "RNAStringSet", "AAStringSet"))
## S4 method for signature 'FaFile,RangesList'
scanFa(file, param, ...,
       as=c("DNAStringSet", "RNAStringSet", "AAStringSet"))
## S4 method for signature 'FaFile,missing'
scanFa(file, param, ...,
       as=c("DNAStringSet", "RNAStringSet", "AAStringSet"))
## S4 method for signature 'FaFile'
getSeq(x, param, ...)
## S4 method for signature 'FaFileList'
getSeq(x, param, ...)
Arguments
con, x
An instance of FaFile or (for getSeq)
FaFileList.
file, index
A character(1) vector of the fasta or fasta index
file path (for FaFile), or an instance of class FaFile
or FaFileList (for scanFaIndex, getSeq).
param
An optional GRanges or
RangesList instance to select records (and
sub-sequences) for input. See Methods, below.
...
Additional arguments.
For FaFileList, this can either be a single character
vector of paths to fasta files, or several
FaFile instances.
For scanFa,FaFile,missing-method this can include
arguments to readDNAStringSet / readRNAStringSet /
readAAStringSet when param is ‘missing’.
rw
Mode of file; ignored.
as
A character(1) vector indicating the type of object to
return.
For scanFaIndex, the default is GRangesList, in which
index information from each file is returned as an element of the
list. With GRanges, index information is collapsed across files into
the unique index elements.
For scanFa, the default is DNAStringSet.
Objects from the Class
Objects are created by calls of the form FaFile().
Fields
The FaFile class inherits fields from the
RsamtoolsFile class.
Functions and methods
FaFileList inherits methods from
RsamtoolsFileList and SimpleList.
Opening / closing:
open.FaFile
Opens the (local or remote) path and
index files. Returns a FaFile instance.
close.FaFile
Closes the FaFile con, returning
(invisibly) the updated FaFile. The instance may be
re-opened with open.FaFile.
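The open / close life cycle described above can be sketched as follows. This is a minimal illustration, not a prescribed workflow: the two-record fasta file and the names fa_path and fa are made up for the example, and the file is written to a temporary location so the code is self-contained.

```r
library(Rsamtools)  # provides FaFile(), indexFa(), countFa()

# Hypothetical input: a tiny two-record fasta file in a temporary location.
fa_path <- tempfile(fileext = ".fa")
writeLines(c(">chr1", "ACGTACGTAC", ">chr2", "GGGCCCAAAT"), fa_path)

# Create the '.fai' index next to the fasta file.
indexFa(fa_path)

# Create the reference and open it once; the index stays loaded
# across the calls made while the file is open.
fa <- FaFile(fa_path)
open(fa)
was_open  <- isOpen(fa)
n_records <- countFa(fa)               # number of records in the file
widths    <- seqlengths(seqinfo(fa))   # sequence widths from the index

# Close when finished; the instance could be re-opened with open(fa).
close(fa)
still_open <- isOpen(fa)
```

Opening explicitly is mainly worthwhile when several queries are made against the same file, since that is when repeated index loading would otherwise add up.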
Accessors:
path
Returns a character(1) vector of the fasta path name.
index
Returns a character(1) vector of the fasta index name
(minus the '.fai' extension).
Methods:
indexFa
Visit the path in path(file) and create an
index file (with the extension ‘.fai’).
scanFaIndex
Read the sequence names and widths
recorded in an indexed fasta file, returning the information as a
GRanges object.
seqinfo
Consult the index file for defined sequences
(seqlevels()) and lengths (seqlengths()).
countFa
Return the number of records in the fasta file.
scanFa
Return the sequences indicated by param as a
DNAStringSet instance. seqnames(param)
selects the sequences to return; start(param) and
end(param) define the (1-based) region of the sequence to
return. Values of end(param) greater than the width of the
sequence cause an error; use seqlengths(FaFile(file)) to
discover sequence widths. When param is missing, all
records are selected. When length(param)==0 no records are
selected.
getSeq
Returns the sequences indicated by param from
the indexed fasta file(s) of x.
For the FaFile method, the return type is a
DNAStringSet. The getSeq,FaFile and
scanFa,FaFile,GRanges methods differ in that getSeq
will reverse complement sequences selected from the minus strand.
For the FaFileList method, the param argument must
be a GRangesList of the same length as x,
creating a one-to-one mapping between the ith element of
x and the ith element of param; the return type
is a SimpleList of DNAStringSet instances, with
elements of the list in the same order as the input elements.
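The difference between scanFa and getSeq on minus-strand ranges can be illustrated with a short sketch. The fasta content, the range chr1:2-5, and the variable names are assumptions made for the example; only functions listed in Usage, plus the GRanges and IRanges constructors, are used.

```r
library(Rsamtools)      # provides FaFile(), indexFa(), scanFa(), scanFaIndex()
library(GenomicRanges)  # provides GRanges()
library(IRanges)        # provides IRanges()

# Hypothetical input: a tiny two-record fasta file in a temporary location.
fa_path <- tempfile(fileext = ".fa")
writeLines(c(">chr1", "ACGTACGTAC", ">chr2", "GGGCCCAAAT"), fa_path)
indexFa(fa_path)
fa <- FaFile(fa_path)

# Sequence names and widths recorded in the index, as a GRanges.
idx <- scanFaIndex(fa)

# Select bases 2..5 (1-based, inclusive) of chr1 on the minus strand.
minus_rng <- GRanges("chr1", IRanges(start = 2, end = 5), strand = "-")

# scanFa returns the selected bases as stored in the file.
as_stored <- scanFa(fa, param = minus_rng)

# getSeq reverse complements ranges selected from the minus strand.
rev_comp <- getSeq(fa, minus_rng)

# With param missing, all records are selected.
all_records <- scanFa(fa)
```

When strand matters for downstream analysis, getSeq is the more convenient choice; scanFa is useful when the bases are wanted exactly as they appear in the file.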