Last data update: 2014.03.03

R: Quality Assessment Rqc function
rqcQA    R Documentation

Quality Assessment Rqc function

Description

Processes a set of files and returns a list of quality control data. Files must be in FASTQ format, compressed or not.

Usage

rqcQA(x, sample = TRUE, n = 1e+06, group = rep("None", length(x)),
  top = 10, pair = seq_along(x), ...)

## S4 method for signature 'list'
rqcQA(x, sample, n, group, top, pair,
  workers = multicoreWorkers())

## S4 method for signature 'character'
rqcQA(x, sample = TRUE, n = 1e+06,
  group = rep("None", length(x)), top = 10, pair = seq_along(x),
  workers = multicoreWorkers())

## S4 method for signature 'BamFile'
rqcQA(x, sample, n, group, top, pair)

## S4 method for signature 'FastqFile'
rqcQA(x, sample, n, group, top, pair)

Arguments

x

input file(s)

sample

If TRUE, a random sample is read from each file.

n

Number of sequences to read from each input file. If 'sample' is TRUE, this is the sample size; otherwise it is the chunk size read on each iteration. By default, a sample of one million sequences is read from each input file.

group

group name for each input file.

top

number of top over-represented reads. Default is 10 reads.

pair

Combination of files for paired-end reads. By default, all input files are treated as single-end. For paired-end data, supply a numeric vector in which two indices with the same value mark a pair. For example, c(1,2,3,4) describes four single-end files and c(1,1,2,2) describes two paired-end samples.

...

other parameters

workers

number of parallel workers
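
As a minimal illustration of how the pair vector is read, the base R sketch below groups some file names by their pair index. The file names are hypothetical placeholders and rqcQA itself is not called; the point is only to show which files end up sharing an index.

```r
# Hypothetical file names, used only to illustrate the 'pair' argument.
files <- c("s1_R1.fastq.gz", "s1_R2.fastq.gz",
           "s2_R1.fastq.gz", "s2_R2.fastq.gz")

# Single-end: every file has its own index (the default, seq_along(x)).
single_end <- seq_along(files)

# Paired-end: two files sharing an index form one pair.
paired_end <- c(1, 1, 2, 2)

# Group the file names by pair index to see which files belong together.
pairs <- split(files, paired_end)
print(pairs)
```

The same vector would then be passed as pair = paired_end, with files in the matching order.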

Details

Input files are read using the FastqStreamer and FastqSampler classes of the ShortRead package. Multiple files are processed in parallel using the bplapply function of the BiocParallel package.
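
The two reading modes can be sketched directly with ShortRead's FastqSampler and FastqStreamer classes. This is only an illustration of sampling versus chunked streaming on the example data shipped with ShortRead, not a copy of what rqcQA does internally; the values of n and the variable names are arbitrary choices.

```r
library(ShortRead)

# Example data shipped with ShortRead (the same folder used in the Examples).
folder <- system.file(package = "ShortRead", "extdata/E-MTAB-1147")
fl <- list.files(folder, full.names = TRUE)[1]

# Sampling mode: draw a random sample of up to 500 reads from the file.
sampler <- FastqSampler(fl, n = 500)
sampled <- yield(sampler)
close(sampler)

# Streaming mode: walk through the whole file in chunks of 5000 reads.
streamer <- FastqStreamer(fl, n = 5000)
total <- 0L
while (length(chunk <- yield(streamer)) > 0) {
  total <- total + length(chunk)
}
close(streamer)

length(sampled)  # number of sampled reads
total            # number of reads seen while streaming
```

Sampling corresponds to sample = TRUE, where n is the sample size; streaming corresponds to sample = FALSE, where n is the chunk size.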

Value

A named list of RqcResultSet objects, each representing one input file.
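
A short sketch of inspecting the returned list, assuming the Rqc and ShortRead packages are installed and that the ShortRead example folder contains the two FASTQ files used in the Examples section. The object name qa and the sample size of 1000 are arbitrary.

```r
library(Rqc)

# Example FASTQ files shipped with ShortRead.
folder <- system.file(package = "ShortRead", "extdata/E-MTAB-1147")
files <- list.files(full.names = TRUE, path = folder)

# Sample 1000 reads per file and treat the two files as one pair.
qa <- rqcQA(files, sample = TRUE, n = 1000, pair = c(1, 1), workers = 1)

length(qa)  # one element per input file
names(qa)   # names of the list elements
```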

Methods (by class)

  • list: process a list of FastqFile and BamFile objects.

  • character: automatically detects the file format of each input file (using the detectFileFormat function), then processes it.

  • BamFile: process only one BAM file.

  • FastqFile: process only one FASTQ file.

Author(s)

Welliton Souza

See Also

rqc

Examples


checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), {
  folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147")
  files <- list.files(full.names=TRUE, path=folder)
  rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1)
}, keep="rqcResultSet")
rqcReadQualityPlot(rqcResultSet)

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(Rqc)
Loading required package: BiocParallel
Loading required package: ShortRead
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: Biostrings
Loading required package: S4Vectors
Loading required package: stats4

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: IRanges
Loading required package: XVector
Loading required package: Rsamtools
Loading required package: GenomeInfoDb
Loading required package: GenomicRanges
Loading required package: GenomicAlignments
Loading required package: SummarizedExperiment
Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: ggplot2
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/Rqc/rqcQA.Rd_%03d_medium.png", width=480, height=480)
> ### Name: rqcQA
> ### Title: Quality Assessment Rqc function
> ### Aliases: rqcQA rqcQA,BamFile-method rqcQA,FastqFile-method
> ###   rqcQA,character-method rqcQA,list-method
> 
> ### ** Examples
> 
> 
> checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), {
+   folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147")
+   files <- list.files(full.names=TRUE, path=folder)
+   rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1)
+ }, keep="rqcResultSet")
/home/ddbj/local/lib64/R/library/Rqc/extdata/Rqc.rda has been loaded.
> rqcReadQualityPlot(rqcResultSet)
> 
> dev.off()
null device 
          1 
>