Last data update: 2014.03.03

getGeneExpression    R Documentation

Calculate gene expression or relative within-gene expression

Description

Calculate either gene expression or relative within-gene expression from transcript expression samples and a transcript information file.

Usage

getGeneExpression(sampleFile, outFile=NULL, trInfo=NULL, trInfoFile=NULL,
      pretend=FALSE)
getWithinGeneExpression(sampleFile, outFile=NULL, trInfo=NULL, trInfoFile=NULL,
      pretend=FALSE, keepOrder=FALSE)

Arguments

sampleFile

File containing the transcript expression samples.

outFile

Name of the output file. If not provided, the function writes to a temporary file.

trInfo

DataFrame containing transcript information. Either trInfo or trInfoFile should be provided; if neither is given, the function tries a file with the same name as sampleFile and the extension tr.

trInfoFile

Transcript information file. Either trInfo or trInfoFile should be provided; if neither is given, the function tries a file with the same name as sampleFile and the extension tr.

pretend

Do not execute; only print the command-line calls for the C++ version of the program.

keepOrder

If TRUE, transcripts always keep the same order; otherwise transcripts may be grouped by gene in the output. (The order never changes if the transcripts are already grouped by gene.)
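
Whether keepOrder matters depends on whether the transcripts are already grouped by gene. The following is a minimal sketch of such a check in base R. The helper isGroupedByGene and the gene labels are hypothetical illustrations, not part of BitSeq, and the check assumes "grouped" means that each gene occupies one contiguous run of rows.

```r
# Hypothetical gene labels, one per transcript, in file order.
groupedGenes   <- c("geneA", "geneA", "geneB", "geneB")
scatteredGenes <- c("geneA", "geneB", "geneA", "geneB")

# Hypothetical helper (not a BitSeq function): TRUE when every gene
# occupies a single contiguous run of rows.
isGroupedByGene <- function(gene) {
  # rle() collapses consecutive repeats; a gene that appears in two
  # separate runs shows up twice among the run values.
  !anyDuplicated(rle(gene)$values)
}

isGroupedByGene(groupedGenes)
isGroupedByGene(scatteredGenes)
```

If the check returns TRUE for the gene column of the transcript information, the note above suggests the output order should be the same with or without keepOrder.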

Details

The getGeneExpression function takes samples of transcript expression and produces a file with gene expression, obtained by adding up the expression of each gene's transcripts.

The getWithinGeneExpression function takes samples of transcript expression and produces a file with relative within-gene expression samples for each transcript.
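
The two calculations can be illustrated on a toy matrix in base R. This is only a sketch of the arithmetic, not the package's implementation (which calls a C++ program). The data and gene labels are invented, and it assumes that "relative within-gene expression" means each transcript's share of its gene's total in a given sample.

```r
# Toy data: 4 transcripts (rows) x 3 samples (columns); values are invented.
trExpr <- matrix(c(10, 12, 11,
                   30, 28, 29,
                    5,  5,  5,
                   15, 15, 15),
                 nrow = 4, byrow = TRUE)

# Hypothetical gene assignment for each transcript row.
gene <- c("geneA", "geneA", "geneB", "geneB")

# Gene expression: add up transcript expression within each gene, per sample.
geneExpr <- rowsum(trExpr, group = gene)

# Relative within-gene expression (assumed definition): each transcript
# divided by the total of its gene in the same sample.
withinGene <- trExpr / geneExpr[gene, , drop = FALSE]

geneExpr
withinGene
```

Under this assumed definition, the within-gene values of the transcripts of one gene add up to 1 in every sample.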

Both functions need valid transcript information, which contains the gene-to-transcript mapping. This can be provided either as the DataFrame trInfo or as a file named by trInfoFile.

In the case of a file, it should be formatted in the following manner. The first line should contain "# M <numberOfTranscripts>" and the following numberOfTranscripts lines have to contain "<geneName> <transcriptName> <transcriptLength>". An example is provided in extdata/ensSelect1.tr. Please note that transcript information files automatically generated from alignment files are not sufficient, because SAM/BAM files do not include gene names. We hope to provide a more convenient way in future versions of BitSeq.
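
A file in the layout described above can be written with base R alone. The sketch below uses made-up gene names, transcript names and lengths, and writes to a temporary file. It follows only the two rules stated here (header line, then one space-separated line per transcript); whether BitSeq accepts additional columns or comment lines is not covered.

```r
# Hypothetical transcript table; all names and lengths are made up.
tr <- data.frame(gene       = c("geneA", "geneA", "geneB"),
                 transcript = c("trA1", "trA2", "trB1"),
                 length     = c(1500L, 900L, 2100L),
                 stringsAsFactors = FALSE)

trFile <- tempfile(fileext = ".tr")

# Header line "# M <numberOfTranscripts>", then one line per transcript:
# "<geneName> <transcriptName> <transcriptLength>".
writeLines(c(paste("# M", nrow(tr)),
             paste(tr$gene, tr$transcript, tr$length)),
           trFile)

readLines(trFile)
```

The resulting path could then be passed as trInfoFile, provided the transcripts are listed in the same order as in the sample file.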

Value

Name of the file containing the new expression samples.

Author(s)

Peter Glaus

See Also

getExpression, tri.load, tri.file.setGeneNames, tri.file.hasGeneNames

Examples

setwd(system.file("extdata",package="BitSeq"))
## use transcript information as object
trinfo <- tri.load("ensSelect1.tr")
## gene expression
getGeneExpression("data-c0b1.rpkm", "data-c0b1-GE.rpkm", trInfo=trinfo)
gExpSamples <- loadSamples("data-c0b1-GE.rpkm")
gExpMeans <- rowMeans(as.data.frame(gExpSamples))
gExpMeans

## within gene expression
wgeFN <- getWithinGeneExpression("data-c0b1.rpkm", trInfoFile="ensSelect1.tr")
wgExpSamples <- loadSamples(wgeFN)
wgExpMeans <- rowMeans(as.data.frame(wgExpSamples))
head(wgExpMeans)

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Platform: x86_64-pc-linux-gnu (64-bit)

> library(BitSeq)
> ### Name: getGeneExpression
> ### Title: Calculate gene expression or relative within gene expression
> ### Aliases: getGeneExpression getWithinGeneExpression
> ### Keywords: gene expression
> 
> ### ** Examples
> 
> setwd(system.file("extdata",package="BitSeq"))
> ## use transcript information as object
> trinfo <- tri.load("ensSelect1.tr")
> ## gene expression
> getGeneExpression("data-c0b1.rpkm", "data-c0b1-GE.rpkm", trInfo=trinfo)
[1] "data-c0b1-GE.rpkm"
> gExpSamples <- loadSamples("data-c0b1-GE.rpkm")
> gExpMeans <- rowMeans(as.data.frame(gExpSamples))
> gExpMeans
[1] 428064.3254   6493.3615 174039.8659    607.9465  96557.9185
> 
> ## within gene expression
> wgeFN <- getWithinGeneExpression("data-c0b1.rpkm", trInfoFile="ensSelect1.tr")
> wgExpSamples <- loadSamples(wgeFN)
> wgExpMeans <- rowMeans(as.data.frame(wgExpSamples))
> head(wgExpMeans)
[1] 0.04536688 0.08511396 0.05179466 0.47166168 0.14001189 0.00524071