a list of data.frame objects.
It contains the number of reads per codon in a CDS.
genomeSeq
a BSgenome object.
It contains the full genome sequences for the organism.
orfCoord
a GRangesList.
The coordinates of the ORFs on the genome.
motifSize
an integer. The number of nucleotides in each motif
on which to compute coverage and usage. Default: 3 nucleotides (one codon).
No motif longer than 6 nucleotides is accepted.
Note: for long motifs, the function can be quite slow.
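The number of distinct motif types grows as 4^motifSize, which may contribute to the slowdown for long motifs. That link is an assumption here, not something taken from the package source. The base-R sketch below only counts the possible motif types; `nMotifTypes` is a hypothetical helper, not a RiboProfiling function.

```r
# Hypothetical helper (not part of RiboProfiling): number of distinct
# nucleotide motifs of a given length over the alphabet A, C, G, T.
nMotifTypes <- function(motifSize) {
    stopifnot(is.numeric(motifSize), length(motifSize) == 1, motifSize >= 1)
    4^motifSize
}

# Compare the default codon size with a 6-nucleotide motif.
motifTypes <- sapply(c(3, 6), nMotifTypes)
names(motifTypes) <- c("motifSize3", "motifSize6")
motifTypes
```

With these inputs, a 3-nucleotide motif has 64 possible types and a 6-nucleotide motif has 4096.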
Value
a list of 2 data.frame objects:
one with the number of times each codon type is found in each ORF and
one with the number of reads for each codon type in each ORF.
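The shape of the two returned tables can be illustrated in base R on toy data. This is a minimal sketch, not the package's implementation: the sequences, read counts, and the names `orfSeqs`, `readsPerCodon`, `splitCodons`, `codonUsage` and `codonCoverage` are all made up for illustration, and the real column and row layout returned by codonInfo may differ.

```r
# Toy inputs (made up for illustration, not package data).
orfSeqs <- c(orfA = "ATGGCCGCCTAA", orfB = "ATGAAAGCCTGA")
# One read count per codon position in each ORF.
readsPerCodon <- list(orfA = c(5, 2, 3, 0), orfB = c(1, 4, 6, 0))

# Split a sequence into consecutive, non-overlapping codons.
splitCodons <- function(s, size = 3) {
    starts <- seq(1, nchar(s) - size + 1, by = size)
    substring(s, starts, starts + size - 1)
}

codons <- lapply(orfSeqs, splitCodons)
codonTypes <- sort(unique(unlist(codons)))

# Table 1: how many times each codon type occurs in each ORF.
codonUsage <- t(sapply(codons, function(cd) {
    as.integer(table(factor(cd, levels = codonTypes)))
}))
colnames(codonUsage) <- codonTypes

# Table 2: total number of reads on each codon type in each ORF.
codonCoverage <- t(sapply(names(codons), function(orf) {
    sums <- tapply(readsPerCodon[[orf]],
                   factor(codons[[orf]], levels = codonTypes), sum)
    sums[is.na(sums)] <- 0  # codon types absent from this ORF
    as.numeric(sums)
}))
colnames(codonCoverage) <- codonTypes

codonUsage <- as.data.frame(codonUsage)
codonCoverage <- as.data.frame(codonCoverage)
```

In this toy case, GCC occurs twice in orfA and carries 2 + 3 = 5 reads, so `codonUsage["orfA", "GCC"]` is 2 and `codonCoverage["orfA", "GCC"]` is 5.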
Examples
#for each codon in each ORF get the read coverage
#parameter listReadsCodon can be returned by the riboSeqFromBam function
#it corresponds to the 2nd element in the list returned by riboSeqFromBam
data(codonIndexCovCtrl)
listReadsCodon <- codonIndexCovCtrl
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene::TxDb.Hsapiens.UCSC.hg19.knownGene
#get the names of the ORFs
#grouped by transcript
cds <- GenomicFeatures::cdsBy(txdb, use.names=TRUE)
orfCoord <- cds[names(cds) %in% names(listReadsCodon)]
#get the genome, please check that the genome has the same seqlevels
genomeSeq <- BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19
#if not, rename the seqlevels and pass gSeq to codonInfo instead
#gSeq <- GenomeInfoDb::renameSeqlevels(genomeSeq,
#sub("chr", "", GenomeInfoDb::seqlevels(genomeSeq)))
#codon frequency, coverage, and annotation
codonData <- codonInfo(listReadsCodon, genomeSeq, orfCoord)
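The seqlevels check mentioned in the comments above can be sketched with plain character vectors. The values below are made-up stand-ins; in a real session they could be obtained with GenomeInfoDb::seqlevels(orfCoord) and GenomeInfoDb::seqlevels(genomeSeq). The names `orfLevels`, `genomeLevels`, `missingBefore` and `missingAfter` are hypothetical.

```r
# Stand-in seqlevels (made up for illustration).
orfLevels <- c("1", "2", "X")
genomeLevels <- c("chr1", "chr2", "chrX", "chrM")

# ORF seqlevels that have no match in the genome.
missingBefore <- setdiff(orfLevels, genomeLevels)

# Drop the "chr" prefix, mirroring the commented renameSeqlevels call.
renamedLevels <- sub("chr", "", genomeLevels)
missingAfter <- setdiff(orfLevels, renamedLevels)

length(missingBefore)  # ORF seqlevels unmatched before renaming
length(missingAfter)   # ORF seqlevels unmatched after renaming
```

An empty `missingAfter` indicates that every ORF seqlevel has a counterpart in the renamed genome.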
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
> library(RiboProfiling)
Loading required package: Biostrings
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from 'package:stats':
IQR, mad, xtabs
The following objects are masked from 'package:base':
Filter, Find, Map, Position, Reduce, anyDuplicated, append,
as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
rbind, rownames, sapply, setdiff, sort, table, tapply, union,
unique, unsplit
Loading required package: S4Vectors
Loading required package: stats4
Attaching package: 'S4Vectors'
The following objects are masked from 'package:base':
colMeans, colSums, expand.grid, rowMeans, rowSums
Loading required package: IRanges
Loading required package: XVector
Warning messages:
1: replacing previous import 'BiocGenerics::Position' by 'ggplot2::Position' when loading 'RiboProfiling'
2: replacing previous import 'ggplot2::Position' by 'BiocGenerics::Position' when loading 'ggbio'
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/RiboProfiling/codonInfo.Rd_%03d_medium.png", width=480, height=480)
> ### Name: codonInfo
> ### Title: Associates the read counts on codons with the codon type for
> ### each ORF.
> ### Aliases: codonInfo
>
> ### ** Examples
>
> #for each codon in each ORF get the read coverage
> #parameter listReadsCodon can be returned by the riboSeqFromBam function
> #it corresponds to the 2nd element in the list returned by riboSeqFromBam
> data(codonIndexCovCtrl)
> listReadsCodon <- codonIndexCovCtrl
>
> txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene::TxDb.Hsapiens.UCSC.hg19.knownGene
>
> #get the names of the ORFs
> #grouped by transcript
> cds <- GenomicFeatures::cdsBy(txdb, use.names=TRUE)
> orfCoord <- cds[names(cds) %in% names(listReadsCodon)]
>
> #get the genome, please check that the genome has the same seqlevels
> genomeSeq <- BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19
> #if not rename it
> #gSeq <- GenomeInfoDb::renameSeqlevels(genomeSeq,
> #sub("chr", "", GenomeInfoDb::seqlevels(genomeSeq)))
>
> #codon frequency, coverage, and annotation
> codonData <- codonInfo(listReadsCodon, genomeSeq, orfCoord)
Loading required package: DBI
Loading required package: tcltk
Error : .onLoad failed in loadNamespace() for 'tcltk', details:
call: fun(libname, pkgname)
error: Tcl/Tk support is not available on this system
In addition: Warning messages:
1: In codonInfo(listReadsCodon, genomeSeq, orfCoord) :
Param motifSize should be an integer! Accepted values 3, 6 or 9. Default value is 3.
2: S3 methods 'as.character.tclObj', 'as.character.tclVar', 'as.double.tclObj', 'as.integer.tclObj', 'as.logical.tclObj', 'as.raw.tclObj', 'print.tclObj', '[[.tclArray', '[[<-.tclArray', '$.tclArray', '$<-.tclArray', 'names.tclArray', 'names<-.tclArray', 'length.tclArray', 'length<-.tclArray', 'tclObj.tclVar', 'tclObj<-.tclVar', 'tclvalue.default', 'tclvalue.tclObj', 'tclvalue.tclVar', 'tclvalue<-.default', 'tclvalue<-.tclVar', 'close.tkProgressBar' were declared in NAMESPACE but not found
>
> dev.off()
null device
1
>