a DGEGLM fitted model object produced by glmFit or glmQLFit. Rows should correspond to exons.
coef
integer indicating which coefficient of the generalized linear model is to be tested for differential exon usage. Defaults to the last coefficient.
contrast
numeric vector specifying the contrast of the linear model coefficients to be tested for differential exon usage. Its length must equal the number of columns of design. If specified, it takes precedence over coef.
geneid
gene identifiers. Either a vector of length nrow(glmfit) or the name of the column of glmfit$genes containing the gene identifiers. Rows with the same ID are assumed to belong to the same gene.
exonid
exon identifiers. Either a vector of length nrow(glmfit) or the name of the column of glmfit$genes containing the exon identifiers.
prior.count
average prior count to be added to each observation to shrink the estimated log-fold-changes towards zero.
verbose
logical. If TRUE, diagnostic information about the number of genes and exons is printed.
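The relationship between coef and contrast can be illustrated with a small base R sketch. It assumes a two-group design with an intercept; the values in beta are hypothetical coefficients made up for illustration, not output from any fit.

```r
# Two-group design with an intercept
group  <- factor(rep(1:2, each = 3))
design <- model.matrix(~group)

# By default, 'coef' refers to the last column of the design matrix
coef <- ncol(design)

# An equivalent contrast vector has one entry per design column
contrast <- c(0, 1)
stopifnot(length(contrast) == ncol(design))

# Hypothetical coefficients (illustration only): the contrast selects
# the same linear combination that 'coef' points to
beta <- c(4.6, -0.5)
all.equal(sum(contrast * beta), beta[coef])

# With a fitted object, either form could then be passed on, e.g.
# diffSpliceDGE(glmfit, contrast = contrast, geneid = "GeneID")
```

A contrast is mainly useful when the comparison of interest is not a single design column, for example a difference between two group coefficients.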
Details
This function tests for differential exon usage for each gene for a given coefficient of the generalized linear model.
Testing for differential exon usage is equivalent to testing whether the exons in each gene have the same log-fold-changes as the other exons in the same gene.
At the exon level, the log-fold-change of each exon is compared to the log-fold-change of the entire gene that contains that exon.
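As a rough illustration of the exon-versus-gene comparison, the sketch below works with known expected counts rather than fitted GLM coefficients, and takes the gene-level log-fold-change from the summed exon counts. Both choices are simplifying assumptions; the function itself estimates these quantities from the fitted model.

```r
# Expected counts for one gene with 10 exons in two groups,
# with the first exon reduced by 90% in group 2
mu1 <- rep(100, 10)          # group 1
mu2 <- c(10, rep(100, 9))    # group 2

# Per-exon log-fold-change (natural log scale)
exon.logFC <- log(mu2 / mu1)

# Log-fold-change of the whole gene, here taken from the summed counts
gene.logFC <- log(sum(mu2) / sum(mu1))

# Exon log-fold-change relative to the gene
rel.logFC <- exon.logFC - gene.logFC
round(rel.logFC, 3)
```

The first exon has a strongly negative relative log-fold-change, while the other nine are slightly positive because the gene total itself drops.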
At the gene level, two different tests are provided. The first converts exon-level p-values to gene-level p-values by the Simes method.
The second uses the exon-level test statistics to conduct a gene-level test.
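The Simes combination mentioned above can be sketched in base R. This is the textbook definition, the minimum over i of n * p_(i) / i for sorted p-values, and is not necessarily the package's internal code. The helper name simes and the p-values are hypothetical, chosen for illustration.

```r
# Simes combination of n p-values: min over i of n * p_(i) / i
simes <- function(p) {
  p <- sort(p)
  n <- length(p)
  min(1, min(n * p / seq_len(n)))
}

# Hypothetical exon-level p-values for one gene (illustration only)
exon.p <- c(0.001, 0.20, 0.45, 0.80)
simes(exon.p)

# Applied per gene using a grouping vector of gene identifiers
gene <- c("GeneA", "GeneA", "GeneA", "GeneA", "GeneB", "GeneB")
p    <- c(exon.p, 0.30, 0.60)
tapply(p, gene, simes)
```

The Simes p-value equals the smallest Benjamini-Hochberg adjusted p-value within the gene, so min(p.adjust(exon.p, method = "BH")) gives the same result.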
Value
diffSpliceDGE produces an object of class DGELRT containing the component design from glmfit plus the following new components:
comparison
character string describing the coefficient being tested.
coefficients
numeric vector of coefficients on the natural log scale. Each coefficient is the difference between the log-fold-change for that exon and the log-fold-change for the entire gene that contains that exon.
genes
data.frame of exon annotation.
genecolname
character string giving the name of the column of genes containing gene IDs.
exoncolname
character string giving the name of the column of genes containing exon IDs.
exon.df.test
numeric vector of testing degrees of freedom for exons.
exon.p.value
numeric vector of p-values for exons.
gene.df.test
numeric vector of testing degrees of freedom for genes.
gene.p.value
numeric vector of gene-level testing p-values.
gene.Simes.p.value
numeric vector of Simes' p-values for genes.
gene.genes
data.frame of gene annotation.
Some components of the output depend on whether glmfit is produced by glmFit or glmQLFit.
If glmfit is produced by glmFit, then the following components are returned in the output object:
exon.LR
numeric vector of LR-statistics for exons.
gene.LR
numeric vector of LR-statistics for gene-level test.
If glmfit is produced by glmQLFit, then the following components are returned in the output object:
exon.F
numeric vector of F-statistics for exons.
gene.df.prior
numeric vector of prior degrees of freedom for genes.
gene.df.residual
numeric vector of residual degrees of freedom for genes.
gene.F
numeric vector of F-statistics for gene-level test.
The information and testing results for both exons and genes are sorted by geneid and by exonid within gene.
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
> library(edgeR)
Loading required package: limma
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/edgeR/diffSpliceDGE.Rd_%03d_medium.png", width=480, height=480)
> ### Name: diffSpliceDGE
> ### Title: Test for Differential Exon Usage
> ### Aliases: diffSpliceDGE
>
> ### ** Examples
>
> # Gene exon annotation
> Gene <- paste("Gene", 1:100, sep="")
> Gene <- rep(Gene, each=10)
> Exon <- paste("Ex", 1:10, sep="")
> Gene.Exon <- paste(Gene, Exon, sep=".")
> genes <- data.frame(GeneID=Gene, Gene.Exon=Gene.Exon)
>
> group <- factor(rep(1:2, each=3))
> design <- model.matrix(~group)
> mu <- matrix(100, nrow=1000, ncol=6)
> # knock-out the first exon of Gene1 by 90%
> mu[1,4:6] <- 10
> # generate exon counts
> counts <- matrix(rnbinom(6000,mu=mu,size=20),1000,6)
>
> y <- DGEList(counts=counts, lib.size=rep(1e6,6), genes=genes)
> gfit <- glmFit(y, design, dispersion=0.05)
>
> ds <- diffSpliceDGE(gfit, geneid="GeneID")
Total number of exons: 1000
Total number of genes: 100
Number of genes with 1 exon: 0
Mean number of exons in a gene: 10
Max number of exons in a gene: 10
> topSpliceDGE(ds)
    GeneID NExons      P.Value          FDR
10   Gene1     10 6.532418e-20 6.532418e-18
450 Gene45     10 1.570785e-02 7.785296e-01
90   Gene9     10 2.760528e-02 7.785296e-01
920 Gene92     10 3.824468e-02 7.785296e-01
610 Gene61     10 3.892648e-02 7.785296e-01
670 Gene67     10 5.134177e-02 8.555098e-01
140 Gene14     10 6.704470e-02 8.555098e-01
40   Gene4     10 7.523250e-02 8.555098e-01
650 Gene65     10 7.699588e-02 8.555098e-01
210 Gene21     10 1.120963e-01 9.515910e-01
> plotSpliceDGE(ds)
> dev.off()
null device
1
>