R Graphical Manual

Browse All

Last data update: 2014.03.03

R: Apply an offset on the read start along the transcript and...

countShiftReads

R Documentation

Apply an offset on the read start along the transcript and returns the coverage on the 5pUTR, CDS, 3pUTR, as well as a matrix of codon coverage per ORF.

Description

Apply an offset on the read start along the transcript and returns the coverage on the 5pUTR, CDS, 3pUTR, as well as a matrix of codon coverage per ORF.

Usage

countShiftReads(exonGRanges, cdsPosTransc, alnGRanges, shiftValue, motifSize)

Arguments

`exonGRanges`	a GRangesList. It contains the exon coordinates grouped by transcript.
`cdsPosTransc`	a list. It contains the relative positions of the start and end of the ORFs. The transcript names in exonGRanges and cdsPosTransc should be the same.
`alnGRanges`	A GRanges object containing the alignment information. In order to improve the performance the GAlignments BAM object should be transformed into a GRanges object with cigar match size metadata.
`shiftValue`	integer. The offset for recalibrating reads on transcripts when computing coverage. The default value for this parameter is 0, no offset should be performed.
`motifSize`	an integer. The number of nucleotides in each motif on which to compute coverage and usage. Default 3 nucleotides (codon).

Value

a list with 2 objects. The first object in the list is a data.frame containing: information on ORFs (names, chromosomal position, length) as well as the counts on the 5pUTR, CDS and 3pUTR once the offset is applied. The second object in the list is a list in itself. It contains for each ORF in the cdsPosTransc, for each codon the sum of read starts covering the 3 codon nucleotides. This per codon coverage does not contain information on the codon type, just its position in the ORF and its coverage.

Examples

#read the BAM file into a GAlignments object using
#GenomicAlignments::readGAlignments
#the GAlignments object should be similar to ctrlGAlignments
data(ctrlGAlignments)
aln <- ctrlGAlignments

#transform the GAlignments object into a GRanges object (faster processing)
alnGRanges <- readsToStartOrEnd(aln, what="start")

#make a txdb object containing the annotations for the specified species.
#In this case hg19.
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene::TxDb.Hsapiens.UCSC.hg19.knownGene
#Please make sure that seqnames of txdb correspond to
#the seqnames of the alignment files ("chr" particle)
#if not rename the txdb seqlevels
#renameSeqlevels(txdb, sub("chr", "", seqlevels(txdb)))

#get all CDSs by transcript
cds <- GenomicFeatures::cdsBy(txdb, by="tx", use.names=TRUE)

#get all exons by transcript
exonGRanges <- GenomicFeatures::exonsBy(txdb, by="tx", use.names=TRUE)
#get the per transcript relative position of start and end codons
cdsPosTransc <- orfRelativePos(cds, exonGRanges)
#compute the counts on the different features after applying
#the specified shift value on the read start along the transcript
countsData <- countShiftReads(exonGRanges[names(cdsPosTransc)], cdsPosTransc,
           alnGRanges, -14)

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(RiboProfiling)
Loading required package: Biostrings
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: S4Vectors
Loading required package: stats4

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: IRanges
Loading required package: XVector
Warning messages:
1: replacing previous import 'BiocGenerics::Position' by 'ggplot2::Position' when loading 'RiboProfiling' 
2: replacing previous import 'ggplot2::Position' by 'BiocGenerics::Position' when loading 'ggbio' 
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/RiboProfiling/countShiftReads.Rd_%03d_medium.png", width=480, height=480)
> ### Name: countShiftReads
> ### Title: Apply an offset on the read start along the transcript and
> ###   returns the coverage on the 5pUTR, CDS, 3pUTR, as well as a matrix of
> ###   codon coverage per ORF.
> ### Aliases: countShiftReads
> 
> ### ** Examples
> 
> #read the BAM file into a GAlignments object using
> #GenomicAlignments::readGAlignments
> #the GAlignments object should be similar to ctrlGAlignments
> data(ctrlGAlignments)
> aln <- ctrlGAlignments
> 
> #transform the GAlignments object into a GRanges object (faster processing)
> alnGRanges <- readsToStartOrEnd(aln, what="start")
> 
> #make a txdb object containing the annotations for the specified species.
> #In this case hg19.
> txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene::TxDb.Hsapiens.UCSC.hg19.knownGene
> #Please make sure that seqnames of txdb correspond to
> #the seqnames of the alignment files ("chr" particle)
> #if not rename the txdb seqlevels
> #renameSeqlevels(txdb, sub("chr", "", seqlevels(txdb)))
> 
> #get all CDSs by transcript
> cds <- GenomicFeatures::cdsBy(txdb, by="tx", use.names=TRUE)
> 
> #get all exons by transcript
> exonGRanges <- GenomicFeatures::exonsBy(txdb, by="tx", use.names=TRUE)
> #get the per transcript relative position of start and end codons
> cdsPosTransc <- orfRelativePos(cds, exonGRanges)
> #compute the counts on the different features after applying
> #the specified shift value on the read start along the transcript
> countsData <- countShiftReads(exonGRanges[names(cdsPosTransc)], cdsPosTransc,
+            alnGRanges, -14)
Warning message:
In countShiftReads(exonGRanges[names(cdsPosTransc)], cdsPosTransc,  :
  Param motifSize should be an integer! Accepted values 3, 6 or 9. Default value is 3.

> 
> 
> 
> 
> 
> dev.off()
null device 
          1 
>