Last data update: 2014.03.03

R: TargetExperiment S4 class implementation in R
TargetExperiment-classR Documentation

TargetExperiment S4 class implementation in R

Description

This S4 class represents a Targeted Sequencing Experiment in R. Targeted Sequencing Experiments are characterized by a 'bed file' that contains the specification of the explored 'features' as a 'panel'. This features could be amplicons, exons, transcripts, among others. In general each feature is associated to one gene. A gene could be related to many features. This class allows the representation and quality control of a Targeted Sequencing Experiment.

Slots

scanBamP

ScanBamParam containing the information to scan the BAM file.

pileupP

PileupParam containing the information to build the pileup.

bedFile

GRanges object that models the bed file.

bamFile

BamFile object that is a reference to the BAM file.

fastaFile

FaFile object that is a reference to the reference sequence.

featurePanel

GRanges object that models the feature panel and related statistics.

genePanel

GRanges object that models the analyzed panel and related statistics at a gene level.

attribute

character indicates which attribute 'coverage' or 'medianCounts' will be used to the analysis.

feature

character indicates the name of the analyzed features. E.g 'amplicon', 'exon', 'transcript'.

Features

  1. Model Targeted Sequencing Experiments in R.

  2. Obtain coverage and read counts per sequenced feature.

  3. Evaluate the performance of a targeted sequencing experiment using coverage/read counts information.

  4. Detect in early stage sequencing or library preparation errors.

  5. Explore read profiles for particular features or genomic regions.

  6. Explore any kind of experiment in which 'feature' definition is possible for several genes. E.g RNA-seq experiments in which transcripts could be the 'features'.

  7. Report quality control results.

Functions

TargetExperiment S4 class includes the following functions:

pileupCounts

calculate pileup statistics for the BAM file

buildFeaturePanel

build and model a feature panel as a GRanges object and compute read statistics

summarizePanel

summarize the feature panel to a gene panel and compute read statistics

initialize

constructor of TargetExperiment to generate the feature and gene panels starting from an alignment BAM file and the bed file

getBedFile, getBamFile, getFeaturePanel, getGenePanel, getAttribute, getFeature, getScanBamP, getPileupP

return the respective TargetExperiment slot

setAttribute,setFeature, setScanBamP, setPileupP

return the respective TargetExperiment slots

show

generic output of the object

print

generic output of the object

summary

print statistics summary for the setted attribute

freqTable

build a frequency table of the attribute occurrence in user configured intervals

plot

plot a summarized view of the feature panel performance

plotAttrExpl

plot the density and distribution of the attribute

plotFeatPerform

plot the sequencing performance for each feature and/or gene

plotFeature

plot the reads profile for a particular feature

plotGeneAttrPerFeat

plot the explored attribute for each feature of a particular gene

plotNtdPercentages

plot nucleotide percentages for each position of a particular feature

plotRegion

plot the reads profile for a particular genomic region

readFrequencies

calculate frequencies of reads fall in and out of targeted regions

plotInOutFeatures

plot frequencies of reads fall in and out of targeted regions

biasExploration

plot attribute distributions along groups of bias sources

plotMetaDataExpl

plot density and box plots or frequency bar plot of metadata columns

addStatSummSheet

internal function to add the first sheet of xlsx reports

buildReport

build the experiment report as an xlsx file

Author(s)

Gabriela A. Merino gmerino@bdmg.com.ar, Cristobal Fresno cfresno@bdmg.com.ar and Elmer A. Fernandez efernandez@bdmg.com.ar examples:

## Defining bam file, bed file and fasta file names and paths bamFile<-system.file("extdata", "mybam.bam", package="TarSeqQC", mustWork=TRUE) bedFile<-system.file("extdata", "mybed.bed", package="TarSeqQC", mustWork=TRUE) fastaFile<-system.file("extdata", "myfasta.fa", package="TarSeqQC", mustWork=TRUE)

## Creating a TargetExperiment object

# Defining feature parameter feature<-"amplicon" # Defining attribute parameter attribute<-"coverage" ampliPanel<-TargetExperiment(bedFile, bamFile, fastaFile, attribute=attribute, feature=feature)

## Alternative object creation # Creating the TargetExperiment object ampliPanel<-TargetExperiment(bedFile, bamFile, fastaFile) # Set feature slot value setFeature(ampliPanel)<-"amplicon" # Set attribute slot value setAttribute(ampliPanel)<-"coverage" # Set pileupP slot value in order to set the maximum depth at 1000 setPileupP(ampliPanel)<-PileupParam(max_depth=1000) # Set the featurePanel slot but now using the new pileupP definition setFeaturePanel(ampliPanel)<-buildFeaturePanel(ampliPanel) ## Early exploration # show/print ampliPanel # summary summary(ampliPanel) # summary at feature level summaryFeatureLev(ampliPanel) # summary at gene level summaryGeneLev(ampliPanel) # attribute boxplot and density plot exploration g<-plotAttrExpl(ampliPanel,level="feature",join=TRUE, log=FALSE, color="blue") # x11(type="cairo") g # explore amplicon length distribution plotMetaDataExpl(ampliPanel, "length", log=FALSE, join=FALSE, color= "blueviolet") # explore gene's relative frequencies plotMetaDataExpl(ampliPanel, "gene", abs=FALSE) ## Deep exploration and Quality Control myfrequencies<-readFrequencies(ampliPanel) plotInOutFeatures(readFrequencies(ampliPanel)) # definition of the interval extreme values attributeThres<-c(0,1,50,200,500, Inf) # plot panel overview g<-plot(ampliPanel, attributeThres, chrLabels =TRUE) g # plot panel overview in a feature performance plot g<-plotFeatPerform(ampliPanel, attributeThres, complete=TRUE, log=FALSE, featureLabs=TRUE, sepChr=TRUE, legend=TRUE) g

# explore possible attribute bias x11(type="cairo") biasExploration(myPanel, source="gc", dens=TRUE) ## Controlling low counts features # Do a frequency table for the attribute intervals summaryIntervals(ampliPanel, attributeThres) # plotting attribute intervals plotAttrPerform(object) # getting low counts features at gene level getLowCtsFeatures(ampliPanel, level="gene", threshold=50) # getting low counts features at feature level getLowCtsFeatures(ampliPanel, level="feature", threshold=50) # exploring amplicon attribute values for a particular gene g<-plotGeneAttrPerFeat(ampliPanel, geneID="gene4") # adjust text size g<-g+theme(title=element_text(size=16), axis.title=element_text(size=16), legend.text=element_text(size=14)) g

##Obtain the pileup matrix for the first amplicon bed<-getBedFile(ampliPanel)[1] ## extracting the pileup matrix myCounts<-pileupCounts(bed, bamFile, fastaFile) head(myCounts)

# getting and exploring a sequenced region of a particular gene getRegion(ampliPanel, level="gene", ID="gene7", collapse=FALSE) # plot a particular genomic region g<-plotRegion(ampliPanel,region=c(4500,6800), seqname="chr10", SNPs=TRUE, xlab="", title="gene7 amplicons",size=0.5) # x11(type="cairo") g # exploring the read count profile for a particular amplicon g<-plotFeature(ampliPanel, featureID="AMPL20") # x11(type="cairo") g # exploring the nucleotide percentages compositions of the read counts for a # particular amplicon g<-plotNtdPercentage(ampliPanel,featureID="AMPL20") g ## Building the XLSX report imageFile<-system.file("extdata", "plot.pdf", package="TarSeqQC", mustWork=TRUE) buildReport(ampliPanel, attributeThres, imageFile ,file="Results.xlsx")

See Also

Rsamtools

Other TargetExperiment: TargetExperiment, ampliPanel, initialize, myCounts

Results