Last data update: 2014.03.03

AVASet    R Documentation

Creating an AVASet

Description

This function imports a project created with Roche's Amplicon Variant Analyzer (AVA) software and stores all information in an extended version of the Biobase eSet.

Usage

AVASet(dirname, avaBin, file_sample, file_amp, file_reference, file_variant, file_variantHits)

Arguments

dirname

The path of the AVA project.
Without AVA-CLI (AVA version < 2.6): A directory that contains the files and subdirectories "Amplicons/ProjectDef/ampliconsProject.txt", "Amplicons/Results/Variants/currentVariantDefs.txt", "Amplicons/Results/Variants", "Amplicons/Results/Align".
Using AVA-CLI (recommended): The path usually ends with the directory "projectfolder".
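
The legacy layout listed above can be checked before AVASet is called. The following is a minimal sketch in base R; the helper name hasLegacyAvaLayout is hypothetical and not part of R453Plus1Toolbox.

```r
# Hypothetical helper (not part of R453Plus1Toolbox): checks that a directory
# contains the files and subdirectories expected for an AVA project < 2.6.
hasLegacyAvaLayout <- function(dirname) {
  requiredFiles <- c(
    file.path("Amplicons", "ProjectDef", "ampliconsProject.txt"),
    file.path("Amplicons", "Results", "Variants", "currentVariantDefs.txt")
  )
  requiredDirs <- c(
    file.path("Amplicons", "Results", "Variants"),
    file.path("Amplicons", "Results", "Align")
  )
  # file.path() is vectorised, so each call checks all entries at once
  filesOk <- file.exists(file.path(dirname, requiredFiles))
  dirsOk <- dir.exists(file.path(dirname, requiredDirs))
  all(filesOk, dirsOk)
}
```

A call such as hasLegacyAvaLayout("myProjectDir") returns TRUE only if all four entries are present.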

avaBin

The directory containing the AVA-CLI binary doAmplicon (usually "bin" in the AVA installation directory)
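
Because avaBin must point to the directory that holds doAmplicon, a quick check can catch a wrong path early. This is a sketch in base R; the helper name hasDoAmplicon is hypothetical.

```r
# Hypothetical helper (not part of R453Plus1Toolbox): TRUE if a file named
# doAmplicon is found directly inside the directory avaBin.
hasDoAmplicon <- function(avaBin) {
  file.exists(file.path(avaBin, "doAmplicon"))
}

# Possible use before an import via AVA-CLI (paths are placeholders):
# if (!hasDoAmplicon("/home/User/AVA/bin")) stop("doAmplicon not found in avaBin")
```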

file_sample

Sample information exported with the AVA-CLI. File has to be in CSV format.

file_amp

Amplicons exported with the AVA-CLI. File has to be in CSV format.

file_reference

Reference sequences exported with the AVA-CLI. File has to be in CSV format.

file_variant

Variant information exported with the AVA-CLI. File has to be in CSV format.

file_variantHits

Report of variant hits exported with the AVA-CLI. File has to be in CSV format.
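
When importing from exported CSV files, it can help to confirm that all five exports are present first. The sketch below uses base R only. The helper missingExports is hypothetical, and it assumes that the file names are resolved relative to dirname, which the package example suggests but this page does not state.

```r
# Hypothetical helper (not part of R453Plus1Toolbox): returns the export files
# that cannot be found.
# Assumption: file names are resolved relative to dirname.
missingExports <- function(dirname, files) {
  present <- file.exists(file.path(dirname, files))
  files[!present]
}

# File names as used in the Examples section
exports <- c(file_sample = "sample.csv", file_amp = "amp.csv",
             file_reference = "reference.csv", file_variant = "variant.csv",
             file_variantHits = "variantHits.csv")
```

missingExports(projectDir, exports) returns a named character vector of the missing files, or an empty vector if all are present.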

Details

The five file arguments for AVA command line interface (AVA-CLI) exports are optional. They are useful for importing exported projects when no AVA software is installed. To export, start the AVA-CLI with the command "doAmplicon", open the project with "open", and then run "list sample", "list amplicon", "list reference", "list variant" and "report variantHits". See AVASet-class for more details.
Given only a project directory and the path to the AVA-CLI binary doAmplicon, AVASet imports all information by accessing the AVA-CLI from within R.

An AVASet object consists of three slots to store data about
1. variants

variantForwCount/variantRevCount:

Data frames that contain the number of reads with the respective variant in forward/reverse direction.

totalForwCount/totalRevCount:

Data frames that contain the total coverage for every variant location in forward/reverse direction.

referenceSeq:

Gives the identifier of the reference sequence.

variantBase/referenceBases:

The bases changed in each variant.

start/end:

The position of the variant on the reference sequence.

canonicalPattern/name:

Short identifiers of a variant including the position and the bases changed.

2. amplicons

forwCount/revCount:

Data frames that contain the number of reads for every amplicon and each sample in forward/reverse direction.

primer1/primer2:

The primer sequences for every amplicon.

referenceSeqID:

The identifier of the reference sequence.

targetStart/targetEnd:

The coordinates of the target region.

3. reference sequences

If additional information has been loaded from Ensembl via alignShortReads, this slot holds the chromosome, position and strand of each reference sequence.
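
The variant and total count tables described under item 1 can be combined into a per-sample variant frequency. The sketch below uses made-up toy data frames whose names mirror the assayData elements. It does not read a real AVASet and is only one possible way to pool both read directions.

```r
# Toy data (made up for illustration): 2 variants (rows) x 2 samples (columns)
ids <- c("C1", "C2")
variantForwCount <- data.frame(Sample_1 = c(10, 0), Sample_2 = c(5, 20), row.names = ids)
variantRevCount  <- data.frame(Sample_1 = c(12, 0), Sample_2 = c(5, 20), row.names = ids)
totalForwCount   <- data.frame(Sample_1 = c(100, 50), Sample_2 = c(100, 80), row.names = ids)
totalRevCount    <- data.frame(Sample_1 = c(120, 50), Sample_2 = c(100, 120), row.names = ids)

# Fraction of reads carrying each variant, pooling forward and reverse reads
variantFreq <- (variantForwCount + variantRevCount) /
  (totalForwCount + totalRevCount)
variantFreq
```

With these toy numbers, variant C1 is found in 22 of 220 reads of Sample_1, a frequency of 0.1.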

The structure of the variant and amplicon data is derived from the Biobase eSet and is thus separated into assayData, phenoData and featureData. All information about the reference sequences is stored in an object of class AlignedRead.

The phenoData of the variants lists the sample IDs together with the name, annotation and group of the read data for all samples. If available, the pico titer plate(s) (PTP) or MID(s) of each sample are shown as well (when using the AVA-CLI, PTPs and MIDs cannot currently be imported).
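
Since the variant data follows the eSet layout, the generic Biobase accessors can be used to inspect it. The following sketch assumes that R453Plus1Toolbox is installed and uses the bundled avaSetExample data set. It covers only the variant part of the object, and the code is skipped if the package is unavailable.

```r
if (requireNamespace("R453Plus1Toolbox", quietly = TRUE)) {
  library(R453Plus1Toolbox)
  data(avaSetExample)

  # assayData: read counts per variant (rows) and sample (columns)
  print(head(assayData(avaSetExample)$variantForwCount))

  # phenoData: one row per sample
  print(head(pData(avaSetExample)))

  # featureData: one row per variant
  print(head(fData(avaSetExample)))
}
```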

Value

An instance of the AVASet class.

Note

It is recommended to import via AVA-CLI access. Although deprecated, the import of projects created with older versions of the AVA software (< v2.6) is still possible.

Author(s)

Christoph Bartenhagen

See Also

AVASet-class, MapperSet-class, alignShortReads

Examples

# Loading a project from AVA version < 2.6:
# Load an AVA dataset containing 6 samples, 4 amplicons and 259 variants
data(avaSetExample)
avaSetExample

# Loading data that was exported via AVA-CLI
# Load an AVA dataset containing 6 samples, 4 amplicons and 222 variants
# by specifying each file exported from the AVA-CLI
projectDir = system.file("extdata", "AVASet_doAmplicon", package="R453Plus1Toolbox")
avaSetExample = AVASet(dirname=projectDir, file_sample="sample.csv", file_amp="amp.csv", file_reference="reference.csv", file_variant="variant.csv", file_variantHits="variantHits.csv")
avaSetExample

# In case the AVA software is installed:
# Suppose, for example, the AVA software was installed to the directory "/home/User/AVA".
# The easiest way to import a project via AVA-CLI would then look like:
# avaSetExample = AVASet(dirname="myProjectDir", avaBin="/home/User/AVA/bin")


Results


> library(R453Plus1Toolbox)
Loading required package: VariantAnnotation
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: GenomeInfoDb
Loading required package: stats4
Loading required package: S4Vectors

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: IRanges
Loading required package: GenomicRanges
Loading required package: SummarizedExperiment
Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: Rsamtools
Loading required package: Biostrings
Loading required package: XVector

Attaching package: 'VariantAnnotation'

The following object is masked from 'package:base':

    tabulate

> ### Name: AVASet
> ### Title: Creating an AVASet
> ### Aliases: AVASet
> ###   AVASet,character,missing,missing,missing,missing,missing,missing-method
> ###   AVASet,character,character,missing,missing,missing,missing,missing-method
> ###   AVASet,character,missing,character,character,character,character,character-method
> ###   AVASet,character,missing,character,character,character,missing,missing-method
> 
> ### ** Examples
> 
> # Loading a project from AVA version < 2.6:
> # Load an AVA dataset containing 6 samples, 4 amplicons and 259 variants
> data(avaSetExample)
> avaSetExample
Variants: 
AVASet (storageMode: list)
assayData: 259 features, 6 samples 
  element names: variantForwCount, totalForwCount, variantRevCount, totalRevCount 
protocolData: none
phenoData
  sampleNames: Sample_1 Sample_2 ... Sample_6 (6 total)
  varLabels: SampleID MID1 ... Annotation (7 total)
  varMetadata: labelDescription
featureData
  featureNames: C1438 C369 ... C763 (259 total)
  fvarLabels: name canonicalPattern ... referenceBases (7 total)
  fvarMetadata: labelDescription
experimentData: use 'experimentData(object)'
Annotation:  

Amplicons: 
assayDataAmp:4 features,  6 samples
  element names:forwCountrevCount
featureDataAmp: 
An object of class 'AnnotatedDataFrame'
  rowNames: TET2_E11.04 TET2_E06 TET2_E11.03 TET2_E04
  varLabels: ampID primer1 ... targetStart (6 total)
  varMetadata: labelDescription

Reference sequences: 
class: AlignedRead
length: 4 reads; width: 339..346 cycles
chromosome: NA NA NA NA 
position: 1 1 1 1 
strand: NA NA NA NA 
alignQuality: NumericQuality 
alignData varLabels: name refSeqID gene 
> 
> # Loading exported data, that was exported via AVA-CLI
> # Load an AVA dataset containing 6 samples, 4 amplicons and 222 variants
> # by specifying each file exported from the AVA-CLI
> projectDir = system.file("extdata", "AVASet_doAmplicon", package="R453Plus1Toolbox")
> avaSetExample = AVASet(dirname=projectDir, file_sample="sample.csv", file_amp="amp.csv", file_reference="reference.csv", file_variant="variant.csv", file_variantHits="variantHits.csv")
Reading sample data ... done
Reading reference sequences ... done
Reading variant data ... done
Reading amplicon data ... done
There were 24 warnings (use warnings() to see them)
> avaSetExample
Variants: 
AVASet (storageMode: list)
assayData: 222 features, 6 samples 
  element names: variantForwCount, totalForwCount, variantRevCount, totalRevCount 
protocolData: none
phenoData
  sampleNames: sample1 sample2 ... sample6 (6 total)
  varLabels: SampleID MID1 ... Annotation (7 total)
  varMetadata: labelDescription
featureData
  featureNames: C1 C2 ... C222 (222 total)
  fvarLabels: name canonicalPattern ... referenceBases (7 total)
  fvarMetadata: labelDescription
experimentData: use 'experimentData(object)'
Annotation:  

Amplicons: 
assayDataAmp:4 features,  6 samples
  element names:forwCountrevCount
featureDataAmp: 
An object of class 'AnnotatedDataFrame'
  rowNames: amp1 amp2 amp3 amp4
  varLabels: ampID primer1 ... targetEnd (6 total)
  varMetadata: labelDescription

Reference sequences: 
class: AlignedRead
length: 4 reads; width: 339..346 cycles
chromosome: NA NA NA NA 
position: 1 1 1 1 
strand: NA NA NA NA 
alignQuality: NumericQuality 
alignData varLabels: name refSeqID gene 
> 
> # In case AVA software is installed:
> # Saying, for example, the AVA software was installed to the directory "/home/User/AVA",
> # the easiest way to import a project via AVA-CLI would look like:
> # avaSetExample = AVASet(dirname="myProjectDir", avaBin="/home/User/AVA/bin")