R: Class to contain Amplicon Variant Analyzer Output
AVASet-class
R Documentation
Class to contain Amplicon Variant Analyzer Output
Description
Container to store data imported from a project of Roche's Amplicon Variant Analyzer (AVA) software.
It stores all information in an extended version of the Biobase ExpressionSet.
Objects from the Class
Objects can be created by calls of the form AVASet(dirname, avaBin).
dirname is a character string giving the project directory and avaBin is a
character string giving the path to the AVA software installation (i.e. the
directory containing the doAmplicon binary). The constructor
starts the AVA command line interface and imports all necessary data.
If the AVA software is not installed on the same machine that runs
R, all data must be exported manually using the AVA Command
Line Interface (AVA-CLI). After having exported all text files, the constructor
AVASet(dirname, avaBin, file_sample, file_amp, file_reference, file_variant, file_variantHits)
can be used to import them. See the example below.
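Before calling the file-based constructor it can help to confirm that all five exports are in place. The following is a minimal base R sketch, not part of R453Plus1Toolbox: the helper missingAvaExports is hypothetical, and it assumes the file names are given relative to dirname, as in the packaged example.

```r
# Hypothetical helper (not part of R453Plus1Toolbox): return the exported
# files that cannot be found below `dirname`.
# Assumption: file names are relative to `dirname`.
missingAvaExports <- function(dirname, files) {
  found <- file.exists(file.path(dirname, files))
  files[!found]
}

# The five exports, named after the constructor arguments they belong to.
exports <- c(file_sample = "sample.csv",
             file_amp = "amp.csv",
             file_reference = "reference.csv",
             file_variant = "variant.csv",
             file_variantHits = "variantHits.csv")

# Demonstrate on a throwaway directory in which only two exports exist.
demoDir <- file.path(tempdir(), "avaExportDemo")
dir.create(demoDir, showWarnings = FALSE)
file.create(file.path(demoDir, c("sample.csv", "amp.csv")))

stillMissing <- missingAvaExports(demoDir, exports)
print(stillMissing)
```

An empty result suggests that the directory is ready to be passed to the constructor; the names of the returned vector indicate which constructor arguments would point at missing files.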
Finally, old project folders generated by AVA software < 2.6 can be
imported using AVASet(dirname), where dirname is the path
to the project folder (i.e. a directory that contains the files
and subdirectories "Amplicons/ProjectDef/ampliconsProject.txt",
"Amplicons/Results/Variants/currentVariantDefs.txt",
"Amplicons/Results/Variants" and "Amplicons/Results/Align").
Slots
assayData:
Object of class AssayData. Contains the number of reads and the total read depth for every variant and each
sample in forward and reverse direction. Its column number equals nrow(phenoData).
featureData:
Object of class AnnotatedDataFrame. Contains information about the type, position and reference of each
variant.
phenoData:
Object of class AnnotatedDataFrame. Contains the sample-IDs and name, annotation and group of the read data
for all samples. If available, the lane, pico titer plate(s) (PTP) or MID(s) of each sample are shown as well.
assayDataAmp:
Object of class AssayData. Contains the number of reads for every amplicon and each sample in forward/reverse
direction. Its row number equals nrow(featureDataAmp).
featureDataAmp:
Object of class AnnotatedDataFrame. Contains the primer sequences, reference sequences and the coordinates
of the target regions for every amplicon.
referenceSequences:
Object of class AlignedRead. If additional alignment information was computed via
alignShortReads, this slot also stores the chromosome, position and strand of each reference sequence.
variantFilterPerc:
Object of class numeric. Contains the threshold used to display only those variants whose
coverage (in percent) in forward and reverse direction is higher than this filter value in at least one sample. See
setVariantFilter for details about setting this value.
variantFilter:
Object of class character. Contains the names of the variants whose
coverage (in percent) in forward and reverse direction is higher than the filter value in
variantFilterPerc in at least one sample.
dirs:
Object of class character. Based on a directory given at instantiation of the object, it contains a vector of several
directories containing all relevant AVA-project files.
experimentData:
Object of class MIAME. Contains details of the experiment.
annotation:
Object of class character. Label associated with the annotation package used in the experiment.
protocolData:
Object of class AnnotatedDataFrame. Contains additional information about the samples.
.__classVersion__:
Object of class Versions. Remembers the R and R453Plus1Toolbox version numbers used to create the
AVASet instance.
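The relation between the variantFilterPerc and variantFilter slots can be illustrated with plain matrices. This is a sketch under stated assumptions, not the package's implementation: the function filterVariants is hypothetical, the threshold is taken to be on a 0 to 100 percent scale, and a variant is kept only if the forward and the reverse percentage exceed the threshold in the same sample. The counts are copied from the avaSetExample output shown in the Results section (variants C1438, C369 and C30; samples Sample_2 and Sample_4).

```r
samples  <- c("Sample_2", "Sample_4")
variants <- c("C1438", "C369", "C30")

# Build a variants-by-samples matrix from values listed row by row.
mk <- function(values) {
  matrix(values, nrow = length(variants), byrow = TRUE,
         dimnames = list(variants, samples))
}

variantForwCount <- mk(c(0, 0,   0, 1,    5, 0))
totalForwCount   <- mk(c(1516, 1729,   1152, 1518,   1516, 1729))
variantRevCount  <- mk(c(0, 0,   0, 11,   6, 0))
totalRevCount    <- mk(c(2020, 2270,   1586, 1934,   2020, 2270))

# Per-strand variant coverage in percent.
forwPerc <- 100 * variantForwCount / totalForwCount
revPerc  <- 100 * variantRevCount / totalRevCount

# Hypothetical filter: keep variants that exceed the threshold on both
# strands in at least one sample (assumed reading of the slot description).
filterVariants <- function(forwPerc, revPerc, variantFilterPerc) {
  passes <- forwPerc > variantFilterPerc & revPerc > variantFilterPerc
  rownames(passes)[rowSums(passes) > 0]
}

keptLoose  <- filterVariants(forwPerc, revPerc, 0)
keptStrict <- filterVariants(forwPerc, revPerc, 0.1)
print(keptLoose)
print(keptStrict)
```

With a threshold of 0 every variant observed on both strands survives; raising the threshold to 0.1 drops C369, whose forward coverage in Sample_4 is only 1 of 1518 reads.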
Extends
Class eSet, directly.
Class VersionedBiobase, by class "eSet", distance 2.
Class Versioned, by class "eSet", distance 3.
Methods
object[i,j]:
Allows subsetting an AVASet object by features (i) and samples (j).
alignShortReads(object, ...):
Retrieves the chromosomal positions of the amplicon
sequences.
setVariantFilter(object):
Sets the filter to display only those variants whose coverage (in percent) in forward and reverse
direction is higher than the given value in at least one sample.
getVariantPercentages(object):
Computes the coverage for every variant over all reads (forward and/or reverse) and for each
sample.
annotateVariants(object):
Annotates given genomic variants. See annotateVariants for details.
htmlReport(object):
Exports all (filtered) variant data into an HTML report. See htmlReport for details.
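The kind of computation described for getVariantPercentages can be sketched in base R. The function variantPercentages and its direction argument are hypothetical stand-ins, not the package's code, and two choices are assumptions: results are scaled to percent, and "both" pools forward and reverse reads before dividing rather than averaging the two strand percentages. The counts are those of variant C369 in Sample_3 and Sample_4 from the avaSetExample output shown in the Results section.

```r
# Hypothetical sketch: variant coverage in percent for one strand or for
# both strands pooled (assumption: "both" pools the read counts).
variantPercentages <- function(varForw, totForw, varRev, totRev,
                               direction = c("both", "forward", "reverse")) {
  direction <- match.arg(direction)
  100 * switch(direction,
    forward = varForw / totForw,
    reverse = varRev / totRev,
    both    = (varForw + varRev) / (totForw + totRev)
  )
}

# Counts for variant C369, taken from the example data set.
samples <- c("Sample_3", "Sample_4")
varForw <- setNames(c(0, 1), samples)
totForw <- setNames(c(195, 1518), samples)
varRev  <- setNames(c(0, 11), samples)
totRev  <- setNames(c(192, 1934), samples)

percBoth <- variantPercentages(varForw, totForw, varRev, totRev, "both")
percForw <- variantPercentages(varForw, totForw, varRev, totRev, "forward")
percRev  <- variantPercentages(varForw, totForw, varRev, totRev, "reverse")
print(round(rbind(forward = percForw, reverse = percRev, both = percBoth), 3))
```

Pooling weights each strand by its read depth, which is why the pooled value lies between the two strand-specific percentages.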
Examples
# sum up class structure
showClass("AVASet")
# load an AVA dataset containing 6 samples, 4 amplicons and 259 variants
data(avaSetExample)
avaSetExample
# show contents of assay, feature and pheno data
head(assayData(avaSetExample)$variantForwCount)
head(assayData(avaSetExample)$totalForwCount)
head(assayData(avaSetExample)$variantRevCount)
head(assayData(avaSetExample)$totalRevCount)
head(fData(avaSetExample))
pData(avaSetExample)
assayDataAmp(avaSetExample)
fDataAmp(avaSetExample)
referenceSequences(avaSetExample)
# Use these commands to export a project from within the AVA-CLI (doAmplicon):
# > list sample -outputFile sample.csv
# > list amplicon -outputFile amp.csv
# > list reference -outputFile reference.csv
# > list variant -outputFile variant.csv
# > report variantHits -outputFile variantHits.csv
# Load an AVA dataset containing 6 samples, 4 amplicons and 222 variants
# by specifying five files, that were exported with the AVA-CLI:
projectDir = system.file("extdata", "AVASet_doAmplicon", package="R453Plus1Toolbox")
avaSetExample = AVASet(dirname=projectDir,
    file_sample="sample.csv",
    file_amp="amp.csv",
    file_reference="reference.csv",
    file_variant="variant.csv",
    file_variantHits="variantHits.csv")
Results
> library(R453Plus1Toolbox)
> ### Name: AVASet-class
> ### Title: Class to contain Amplicon Variant Analyzer Output
> ### Aliases: AVASet-class [,AVASet,ANY,ANY-method
> ### annotateVariants,AVASet-method assayDataAmp,AVASet-method
> ### assayDataAmp<- assayDataAmp<-,AVASet,AssayData-method
> ### fDataAmp,AVASet-method featureDataAmp,AVASet-method featureDataAmp<-
> ### featureDataAmp<-,AVASet,AnnotatedDataFrame-method
> ### htmlReport,AVASet-method
> ### alignShortReads,AVASet,DNAStringSet,character-method
> ### referenceSequences,AVASet-method referenceSequences<-
> ### referenceSequences<-,AVASet,AlignedRead-method
> ### setVariantFilter,AVASet-method getVariantPercentages,AVASet-method
> ### Keywords: classes
>
> ### ** Examples
>
>
> # sum up class structure
> showClass("AVASet")
Class "AVASet" [package "R453Plus1Toolbox"]
Slots:
Name: assayDataAmp featureDataAmp referenceSequences
Class: AssayData AnnotatedDataFrame AlignedRead
Name: variantFilterPerc variantFilter dirs
Class: numeric character character
Name: assayData phenoData featureData
Class: AssayData AnnotatedDataFrame AnnotatedDataFrame
Name: experimentData annotation protocolData
Class: MIAxE character AnnotatedDataFrame
Name: .__classVersion__
Class: Versions
Extends:
Class "eSet", directly
Class "VersionedBiobase", by class "eSet", distance 2
Class "Versioned", by class "eSet", distance 3
>
> # load an AVA dataset containing 6 samples, 4 amplicons and 259 variants
> data(avaSetExample)
> avaSetExample
Variants:
AVASet (storageMode: list)
assayData: 259 features, 6 samples
element names: variantForwCount, totalForwCount, variantRevCount, totalRevCount
protocolData: none
phenoData
sampleNames: Sample_1 Sample_2 ... Sample_6 (6 total)
varLabels: SampleID MID1 ... Annotation (7 total)
varMetadata: labelDescription
featureData
featureNames: C1438 C369 ... C763 (259 total)
fvarLabels: name canonicalPattern ... referenceBases (7 total)
fvarMetadata: labelDescription
experimentData: use 'experimentData(object)'
Annotation:
Amplicons:
assayDataAmp:4 features, 6 samples
element names:forwCountrevCount
featureDataAmp:
An object of class 'AnnotatedDataFrame'
rowNames: TET2_E11.04 TET2_E06 TET2_E11.03 TET2_E04
varLabels: ampID primer1 ... targetStart (6 total)
varMetadata: labelDescription
Reference sequences:
class: AlignedRead
length: 4 reads; width: 339..346 cycles
chromosome: NA NA NA NA
position: 1 1 1 1
strand: NA NA NA NA
alignQuality: NumericQuality
alignData varLabels: name refSeqID gene
>
> # show contents of assay, feature and pheno data
> head(assayData(avaSetExample)$variantForwCount)
Sample_1 Sample_2 Sample_3 Sample_4 Sample_5 Sample_6
C1438 0 0 0 0 0 0
C369 0 0 0 1 0 0
C595 0 0 0 0 0 0
C397 0 0 0 0 0 0
C30 0 5 0 0 0 0
C1699 0 0 0 0 0 0
> head(assayData(avaSetExample)$totalForwCount)
Sample_1 Sample_2 Sample_3 Sample_4 Sample_5 Sample_6
C1438 119 1516 137 1729 1288 140
C369 267 1152 195 1518 1016 190
C595 258 1805 230 1885 1775 221
C397 258 1805 230 1885 1775 221
C30 119 1516 137 1729 1288 140
C1699 119 1516 137 1729 1288 140
> head(assayData(avaSetExample)$variantRevCount)
Sample_1 Sample_2 Sample_3 Sample_4 Sample_5 Sample_6
C1438 0 0 0 0 0 0
C369 0 0 0 11 0 0
C595 0 0 0 0 0 0
C397 0 0 0 0 0 0
C30 0 6 0 0 0 0
C1699 0 0 0 0 0 0
> head(assayData(avaSetExample)$totalRevCount)
Sample_1 Sample_2 Sample_3 Sample_4 Sample_5 Sample_6
C1438 162 2020 188 2270 1488 159
C369 192 1586 192 1934 1198 137
C595 172 2169 239 2160 2127 160
C397 172 2169 239 2160 2127 160
C30 162 2020 188 2270 1488 159
C1699 162 2020 188 2270 1488 159
> head(fData(avaSetExample))
name canonicalPattern referenceSeqID start end variantBase
C1438 303:T/C s(303,C) I37 303 303 C
C369 309:T/C s(309,C) I36 309 309 C
C595 108:T/C s(108,C) I40 108 108 C
C397 246:A/G s(246,G) I40 246 246 G
C30 225:A/G s(225,G) I37 225 225 G
C1699 28:T/C s(28,C) I37 28 28 C
referenceBases
C1438 T
C369 T
C595 T
C397 A
C30 A
C1699 T
> pData(avaSetExample)
SampleID MID1 MID2 PTP_AccNum Lane ReadGroup
Sample_1 I9646 Mid3 Mid3 GGSFDBH 07 ReadGrp_7
Sample_2 I116 Mid1 Mid1 GA0582C 01 ReadGrp_1
Sample_3 I9644 Mid1 Mid1 GGSFDBH 07 ReadGrp_7
Sample_4 I118 Mid3 Mid3 GA0582C 01 ReadGrp_1
Sample_5 I117 Mid2 Mid2 GA0582C 01 ReadGrp_1
Sample_6 I9645 Mid2 Mid2 GGSFDBH 07 ReadGrp_7
Annotation
Sample_1 Run #006 - PTP 731232 - 05MAY2010
Sample_2 -
Sample_3 Run #006 - PTP 731232 - 05MAY2010
Sample_4 -
Sample_5 -
Sample_6 Run #006 - PTP 731232 - 05MAY2010
> assayDataAmp(avaSetExample)
$forwCount
Sample_1 Sample_2 Sample_3 Sample_4 Sample_5 Sample_6
TET2_E11.04 119 1516 137 1729 1288 140
TET2_E06 248 400 224 478 339 204
TET2_E11.03 267 1152 195 1518 1016 190
TET2_E04 258 1805 230 1885 1775 221
$revCount
Sample_1 Sample_2 Sample_3 Sample_4 Sample_5 Sample_6
TET2_E11.04 162 2020 188 2270 1488 159
TET2_E06 236 2094 255 2171 1624 181
TET2_E11.03 192 1586 192 1934 1198 137
TET2_E04 172 2169 239 2160 2127 160
> fDataAmp(avaSetExample)
ampID primer1 primer2
TET2_E11.04 I90 CATTCACCTTCTCACATAATCCA GAATTGACCCATGAGTTGGAG
TET2_E06 I81 TGCAAGTGACCCTTGTTTTG AACCAAAGATTGGGCTTTCC
TET2_E11.03 I89 GCTCAGTCTACCACCCATCC AGATGCAGGGCATGAAGAGA
TET2_E04 I79 GGGGTTAAGCTTTGTGGATG TTGTGACTCTCTGGTGAATAGCA
referenceSeqID targetEnd targetStart
TET2_E11.04 I37 325 24
TET2_E06 I42 321 21
TET2_E11.03 I36 319 21
TET2_E04 I40 322 21
> referenceSequences(avaSetExample)
class: AlignedRead
length: 4 reads; width: 339..346 cycles
chromosome: NA NA NA NA
position: 1 1 1 1
strand: NA NA NA NA
alignQuality: NumericQuality
alignData varLabels: name refSeqID gene
>
> # Use these commands to export a project from within the AVA-CLI (doAmplicon):
> # > list sample -outputFile sample.csv
> # > list amplicon -outputFile amp.csv
> # > list reference -outputFile reference.csv
> # > list variant -outputFile variant.csv
> # > report variantHits -outputFile variantHits.csv
>
> # Load an AVA dataset containing 6 samples, 4 amplicons and 222 variants
> # by specifying five files, that were exported with the AVA-CLI:
> projectDir = system.file("extdata", "AVASet_doAmplicon", package="R453Plus1Toolbox")
> avaSetExample = AVASet(dirname=projectDir, file_sample="sample.csv", file_amp="amp.csv", file_reference="reference.csv", file_variant="variant.csv", file_variantHits="variantHits.csv")
Reading sample data ... done
Reading reference sequences ... done
Reading variant data ... done
Reading amplicon data ... done
There were 24 warnings (use warnings() to see them)