Performs a pathway enrichment analysis of a defined set of genes.
Usage
pathEnrich(geneList, geneSets, universe=NULL)
Arguments
geneList
vector of gene names to be used for pathway enrichment
geneSets
"GeneSetCollection" object with functional pathway gene sets
universe
number of genes that were probed in the initial experiment
Details
geneSets is a "GeneSetCollection" object containing gene sets from one or more databases. Gene sets from different sources are allowed, but they have to be provided in Gene Matrix Transposed file format (*.gmt), in which each gene set is described by a pathway name, a description, and the genes in the gene set. Two examples demonstrate how to define the geneSets object. See examples.
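The layout of a .gmt file can be illustrated with a minimal base-R sketch. The pathway names, descriptions, and gene symbols below are made up for illustration, and the parsing shown here is only a plain-R stand-in; in practice the file is read with getGmt(), as in the examples.

```r
# Write a tiny GMT file: one gene set per line, tab-separated fields
# (pathway name, description, then the member genes).
# All names below are hypothetical and exist only for this sketch.
gmt_lines <- c(
  paste(c("TOY_PATHWAY_A", "toy description A", "GENE1", "GENE2", "GENE3"),
        collapse = "\t"),
  paste(c("TOY_PATHWAY_B", "toy description B", "GENE2", "GENE4"),
        collapse = "\t")
)
gmt_file <- tempfile(fileext = ".gmt")
writeLines(gmt_lines, gmt_file)

# Parse the file back into a named list of gene vectors using base R only.
fields <- strsplit(readLines(gmt_file), "\t", fixed = TRUE)
toy_sets <- lapply(fields, function(x) x[-(1:2)])          # drop name and description
names(toy_sets) <- vapply(fields, function(x) x[1], character(1))
toy_descriptions <- vapply(fields, function(x) x[2], character(1))

print(toy_sets)
```

Each element of toy_sets corresponds to one line of the file, with the first field used as the pathway name.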
The universe argument is the total number of genes that were probed in the initial experiment, e.g. the number of all genes on a microarray. If universe is not defined, it is set to the number of all genes that can be mapped to any pathway in the chosen database.
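The effect of universe on the result can be sketched with a generic over-representation test. The sketch assumes a one-sided hypergeometric test, which is a common choice for this kind of analysis; this page does not state which test pathEnrich uses internally. The pathway size, overlap, and alternative universe are hypothetical numbers.

```r
# Assumption: a one-sided hypergeometric over-representation test.
# This is an illustration, not necessarily what pathEnrich() computes.
universe  <- 6536   # genes probed in the experiment (value used in the examples)
n_input   <- 952    # size of the input gene list (illustrative)
n_pathway <- 100    # genes annotated to the pathway (hypothetical)
n_overlap <- 30     # input genes that fall into the pathway (hypothetical)

# P(X >= n_overlap) when n_input genes are drawn at random from the universe
p_value <- phyper(n_overlap - 1, n_pathway, universe - n_pathway,
                  n_input, lower.tail = FALSE)

# Same overlap, but with a larger (hypothetical) universe of 20000 genes:
# fewer overlapping genes are expected by chance, so the p-value shrinks.
p_value_large <- phyper(n_overlap - 1, n_pathway, 20000 - n_pathway,
                        n_input, lower.tail = FALSE)

print(c(universe_6536 = p_value, universe_20000 = p_value_large))
```

Under this assumption, overstating the universe makes the same overlap look more significant, which is why the value should reflect the genes that were actually probed.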
Value
A data.frame with the following columns:
pathway
names of enriched pathways
description
gene set description (e.g. a link to the named gene set in MSigDB)
genes_in_pathway
total number of known genes in the pathway
%_match
number of matched genes as a percentage of the total number of known genes in the pathway
pValue
p-value
adj.pValue
Benjamini-Hochberg adjusted p-value
overlap
genes from the input gene list that overlap with the known genes in the pathway
Additionally, a .txt file containing all of the above information is created.
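How the %_match and adj.pValue columns relate to the other columns can be sketched on a toy table. All values below are made up, n_matched is a helper column that is not part of the documented output, and the use of p.adjust() is an assumption about how a Benjamini-Hochberg adjustment is typically computed in R rather than a statement about the internals of pathEnrich.

```r
# Toy table using the documented column names; every value is hypothetical.
toy_result <- data.frame(
  pathway          = c("TOY_PATHWAY_A", "TOY_PATHWAY_B", "TOY_PATHWAY_C"),
  genes_in_pathway = c(50, 120, 80),
  n_matched        = c(10, 6, 4),        # helper column for this sketch only
  pValue           = c(0.001, 0.02, 0.6),
  stringsAsFactors = FALSE
)

# Matched genes as a percentage of all known genes in the pathway
toy_result[["%_match"]] <- 100 * toy_result$n_matched / toy_result$genes_in_pathway

# Benjamini-Hochberg adjustment across all tested pathways
toy_result[["adj.pValue"]] <- p.adjust(toy_result$pValue, method = "BH")

# Keep pathways below an illustrative 5% threshold on the adjusted p-value
significant <- toy_result[toy_result$adj.pValue <= 0.05, ]
print(significant)
```

Because "%_match" is not a syntactic R name, the column is accessed with [["%_match"]] rather than with $.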
Author(s)
Agata Michna
References
Subramanian, A., Tamayo, P., Mootha, V. K., Mukherjee, S., Ebert, B. L., Gillette, M. A., Paulovich, A., Pomeroy, S. L., Golub, T. R., Lander, E. S. and Mesirov, J. P. (2005). Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. PNAS 102(43), 15545-15550.
Examples
## Not run:
## Example 1 - using gene sets from the Molecular Signatures Database (MSigDB collections)
## Download .gmt file 'c2.all.v5.0.symbols.gmt' (all curated gene sets, gene symbols)
## from the Broad, http://www.broad.mit.edu/gsea/downloads.jsp#msigdb, then
geneSets <- getGmt("/path/to/c2.all.v5.0.symbols.gmt")
## load "eSetObject" containing simulated time-course data
data(TCsimData)
## check for differentially expressed genes
diffExprs <- splineDiffExprs(eSetObject = TCsimData, df = 3, cutoff.adj.pVal = 0.01, reference = "T1")
## use differentially expressed genes for pathway enrichment analysis
enrichPath <- pathEnrich(geneList = rownames(diffExprs), geneSets = geneSets, universe = 6536)
## End(Not run)
## Not run:
## Example 2 - using gene sets from the Reactome Pathway Database
## Download and unzip .gmt.zip file 'ReactomePathways.gmt.zip'
## ("Reactome Pathways Gene Set" under "Specialized data formats") from the Reactome website
## http://www.reactome.org/pages/download-data/, then
geneSets <- getGmt("/path/to/ReactomePathways.gmt")
data(TCsimData)
diffExprs <- splineDiffExprs(eSetObject = TCsimData, df = 3, cutoff.adj.pVal = 0.01, reference = "T1")
enrichPath <- pathEnrich(geneList = rownames(diffExprs), geneSets = geneSets, universe = 6536)
## End(Not run)
## Small example with gene sets consisting of KEGG pathways only
geneSets <- getGmt(system.file("extdata", "c2.cp.kegg.v5.0.symbols.gmt", package="splineTimeR"))
data(TCsimData)
diffExprs <- splineDiffExprs(eSetObject = TCsimData, df = 3, cutoff.adj.pVal = 0.01, reference = "T1")
enrichPath <- pathEnrich(geneList = rownames(diffExprs), geneSets = geneSets, universe = 6536)
Results
> library(splineTimeR)
Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from 'package:stats':
IQR, mad, xtabs
The following objects are masked from 'package:base':
Filter, Find, Map, Position, Reduce, anyDuplicated, append,
as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
rbind, rownames, sapply, setdiff, sort, table, tapply, union,
unique, unsplit
Welcome to Bioconductor
Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.
Loading required package: igraph
Attaching package: 'igraph'
The following objects are masked from 'package:BiocGenerics':
normalize, union
The following objects are masked from 'package:stats':
decompose, spectrum
The following object is masked from 'package:base':
union
Loading required package: limma
Attaching package: 'limma'
The following object is masked from 'package:BiocGenerics':
plotMA
Loading required package: GSEABase
Loading required package: annotate
Loading required package: AnnotationDbi
Loading required package: stats4
Loading required package: IRanges
Loading required package: S4Vectors
Attaching package: 'S4Vectors'
The following object is masked from 'package:igraph':
compare
The following objects are masked from 'package:base':
colMeans, colSums, expand.grid, rowMeans, rowSums
Attaching package: 'IRanges'
The following object is masked from 'package:igraph':
simplify
Loading required package: XML
Loading required package: graph
Attaching package: 'graph'
The following object is masked from 'package:XML':
addNode
The following objects are masked from 'package:igraph':
degree, edges, intersection
Loading required package: gtools
Attaching package: 'gtools'
The following object is masked from 'package:igraph':
permute
Loading required package: splines
Loading required package: GeneNet
Loading required package: corpcor
Loading required package: longitudinal
Loading required package: fdrtool
Loading required package: FIs
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/splineTimeR/pathEnrich.Rd_%03d_medium.png", width=480, height=480)
> ### Name: pathEnrich
> ### Title: Pathway enrichment analysis
> ### Aliases: pathEnrich
> ### Keywords: gene set enrichment analysis pathway enrichment analysis
>
> ### ** Examples
>
> ## Not run:
> ##D ## Example 1 - using gene sets from the Molecular Signatures Database (MSigDB collections)
> ##D ## Download .gmt file 'c2.all.v5.0.symbols.gmt' (all curated gene sets, gene symbols)
> ##D ## from the Broad, http://www.broad.mit.edu/gsea/downloads.jsp#msigdb, then
> ##D geneSets <- getGmt("/path/to/c2.all.v5.0.symbols.gmt")
> ##D ## load "eSetObject" containing simulated time-course data
> ##D data(TCsimData)
> ##D ## check for differentially expressed genes
> ##D diffExprs <- splineDiffExprs(eSetObject = TCsimData, df = 3, cutoff.adj.pVal = 0.01, reference = "T1")
> ##D ## use differentially expressed genes for pathway enrichment analysis
> ##D enrichPath <- pathEnrich(geneList = rownames(diffExprs), geneSets = geneSets, universe = 6536)
> ## End(Not run)
>
> ## Not run:
> ##D ## Example 2 - using gene sets from the Reactome Pathway Database
> ##D ## Download and unzip .gmt.zip file 'ReactomePathways.gmt.zip'
> ##D ## ("Reactome Pathways Gene Set" under "Specialized data formats") from the Reactome website
> ##D ## http://www.reactome.org/pages/download-data/, then
> ##D geneSets <- getGmt("/path/to/ReactomePathways.gmt")
> ##D data(TCsimData)
> ##D diffExprs <- splineDiffExprs(eSetObject = TCsimData, df = 3, cutoff.adj.pVal = 0.01, reference = "T1")
> ##D enrichPath <- pathEnrich(geneList = rownames(diffExprs), geneSets = geneSets, universe = 6536)
> ## End(Not run)
>
> ## Small example with gene sets consist of KEGG pathways only
> geneSets <- getGmt(system.file("extdata", "c2.cp.kegg.v5.0.symbols.gmt", package="splineTimeR"))
> data(TCsimData)
> diffExprs <- splineDiffExprs(eSetObject = TCsimData, df = 3, cutoff.adj.pVal = 0.01, reference = "T1")
-------------------------------------------------
Differential analysis done for df = 3 and adj.P.Val <= 0.01
Number of differentially expressed genes: 952
> enrichPath <- pathEnrich(geneList = rownames(diffExprs), geneSets = geneSets, universe = 6536)
--------------------------------------------------------
Pathway enrichment done!
--------------------------------------------------------
>
>
>
>
>
> dev.off()
null device
1
>