R: Summarize Probe Sets Associated with a hyperGTest Result
probeSetSummary
R Documentation
Description
Given the result of a hyperGTest run (an instance of
GOHyperGResult), this function lists all Probe Set IDs
associated with the selected Entrez IDs annotated at each
significant GO term in the test result.
Arguments
result
A GOHyperGResult instance. This is the output
of the hyperGTest function when testing the GO category.
pvalue
Optional p-value cutoff. Only results for GO terms
with a p-value less than the specified value will be returned.
If omitted, pvalueCutoff(result) is used.
categorySize
Optional minimum size (number of annotations)
for the GO terms. Only results for GO terms with
categorySize or more annotations will be returned. If
omitted, no category size criterion is applied.
sigProbesets
Optional vector of probeset IDs. See details for
more information.
ids
Character. The type of IDs used in creating the
GOHyperGResult object. Usually 'ENTREZID', but may be, e.g.,
'ACCNUM' if using an A. thaliana chip.
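The pvalue and categorySize cutoffs described above can be illustrated with a small base R sketch. The table below is a made-up stand-in for per-term test results (the GO IDs, p-values, sizes, and column names are all hypothetical); it only shows that the p-value comparison is strict while the size comparison is inclusive.

```r
## Hypothetical per-term results; GO IDs, p-values and sizes are made up
terms <- data.frame(GOID = c("GO:0000001", "GO:0000002", "GO:0000003"),
                    Pvalue = c(0.001, 0.05, 0.2),
                    Size = c(25L, 40L, 8L),
                    stringsAsFactors = FALSE)

pvalue <- 0.05       # keep terms with a p-value strictly below this
categorySize <- 10   # keep terms with at least this many annotations

keep <- terms$Pvalue < pvalue & terms$Size >= categorySize
kept <- terms$GOID[keep]
print(kept)
```

A term whose p-value equals the cutoff exactly is not kept, because the documented rule is "less than" rather than "less than or equal to".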
Details
Usually the goal of doing a Fisher's exact test on a set of
significant probesets is to find pathways or cellular activities that
are being perturbed in an experiment. After doing the test, one
usually gets a list of significant GO terms, and the next logical step
might be to determine which probesets contributed to the significance
of a certain term.
Because the input for the Fisher's exact test consists of a vector of
unique Entrez Gene IDs, and there may be multiple probesets that
interrogate a particular transcript, the output for this function lists
all of the probesets that map to each Entrez Gene ID, along with an
indicator that shows which of the probesets were used as input.
The rationale for this is that one might not be able to assume a given
probeset actually interrogates the intended transcript, so it might be
useful to be able to check to see what other similar probesets are
doing.
Because one of the first steps before running hyperGTest is to
subset the input vectors of geneIds and universeGeneIds, any
information about probeset IDs that interrogate the same gene
transcript is lost. In order to recover this information, one can pass
a vector of probeset IDs that were considered significant. This vector
will then be used to indicate which of the probesets that map to a
given GO term were significant in the original analysis.
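The bookkeeping described above can be sketched in base R with toy data. This is only an illustration of the idea, not the function's actual implementation; the probeset names and Entrez IDs are invented, and the object name toy is a placeholder.

```r
## Hypothetical toy mapping: probeset ID -> Entrez Gene ID (all values made up)
probe2entrez <- c(p1_at = "100", p2_at = "100", p3_at = "200",
                  p4_at = "300", p5_at = "300")

## Probesets that were called significant in the original analysis
sigProbesets <- c("p2_at", "p3_at")

## Unique Entrez IDs, as would be passed to the test;
## the probeset-level detail is lost at this step
geneIds <- unique(unname(probe2entrez[sigProbesets]))

## Recover every probeset that maps to those genes and flag the
## ones that were in the significant set
keep <- probe2entrez %in% geneIds
toy <- data.frame(EntrezID = unname(probe2entrez[keep]),
                  ProbeSetID = names(probe2entrez)[keep],
                  selected = as.integer(names(probe2entrez)[keep] %in% sigProbesets),
                  stringsAsFactors = FALSE)
print(toy)
```

Here p1_at is listed because it maps to the same gene as p2_at, but it is flagged 0 because it was not in the significant set. With a real result object, the same vector would be supplied through the sigProbesets argument.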
Value
A list of data.frames. Each element of the list
corresponds to one of the GO terms (the term is provided as the name
of the element). Each data.frame has three columns:
the Entrez Gene ID (EntrezID), the probe set ID
(ProbeSetID), and a 0/1 indicator of whether the probe set ID
was provided as part of the initial input (selected).
Note that this 0/1 indicator will only be correct if the 'geneIds'
vector used to construct the GOHyperGParams object was a named
vector (where the names are probeset IDs), or if a vector of
'sigProbesets' was passed to this function.
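A returned list of this shape can be handled with ordinary list and data.frame tools. The sketch below builds a hypothetical stand-in by hand (the GO IDs, Entrez IDs, and probeset names are made up) and shows two common manipulations: pulling out the selected probesets per term, and stacking all terms into a single data.frame.

```r
## Hypothetical stand-in for the list returned by probeSetSummary
ps <- list(
  "GO:0000001" = data.frame(EntrezID = c("100", "100", "200"),
                            ProbeSetID = c("p1_at", "p2_at", "p3_at"),
                            selected = c(0L, 1L, 1L),
                            stringsAsFactors = FALSE),
  "GO:0000002" = data.frame(EntrezID = c("300", "300"),
                            ProbeSetID = c("p4_at", "p5_at"),
                            selected = c(1L, 0L),
                            stringsAsFactors = FALSE))

## Probesets per GO term that were part of the original input
selectedByTerm <- lapply(ps, function(df) df$ProbeSetID[df$selected == 1])

## Stack all terms into one data.frame with an extra GOID column
n <- vapply(ps, nrow, integer(1))
flat <- cbind(GOID = rep(names(ps), n), do.call(rbind, ps))
rownames(flat) <- NULL
print(flat)
```

The flat form is convenient for writing the summary to a file or merging it with expression statistics keyed by ProbeSetID.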
Author(s)
S. Falcon and J. MacDonald
Examples
## Fake up some data
library("hgu95av2.db")
library("annotate")
prbs <- ls(hgu95av2GO)[1:300]
## Only those with GO ids
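## probes without GO annotation map to NA rather than a list; testing
## is.list() keeps the condition length-one, as && requires in recent R
hasGO <- sapply(mget(prbs, hgu95av2GO), function(ids)
    is.list(ids) && length(ids) > 1)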
prbs <- prbs[hasGO]
prbs <- getEG(prbs, "hgu95av2")
## remove duplicates, but keep named vector
prbs <- prbs[!duplicated(prbs)]
## do the same for universe
univ <- ls(hgu95av2GO)[1:5000]
hasUnivGO <- sapply(mget(univ, hgu95av2GO), function(ids)
    is.list(ids) && length(ids) > 1)
univ <- univ[hasUnivGO]
univ <- unique(getEG(univ, "hgu95av2"))
p <- new("GOHyperGParams", geneIds=prbs, universeGeneIds=univ,
         ontology="BP", annotation="hgu95av2", conditional=TRUE)
## this part takes time...
if(interactive()){
    hyp <- hyperGTest(p)
    ps <- probeSetSummary(hyp, 0.05, 10)
}
Results
> library(GOstats)
Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from 'package:stats':
IQR, mad, xtabs
The following objects are masked from 'package:base':
Filter, Find, Map, Position, Reduce, anyDuplicated, append,
as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
rbind, rownames, sapply, setdiff, sort, table, tapply, union,
unique, unsplit
Welcome to Bioconductor
Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.
Loading required package: Category
Loading required package: stats4
Loading required package: AnnotationDbi
Loading required package: IRanges
Loading required package: S4Vectors
Attaching package: 'S4Vectors'
The following objects are masked from 'package:base':
colMeans, colSums, expand.grid, rowMeans, rowSums
Loading required package: Matrix
Attaching package: 'Matrix'
The following object is masked from 'package:S4Vectors':
expand
Loading required package: graph
Attaching package: 'GOstats'
The following object is masked from 'package:AnnotationDbi':
makeGOGraph
> ### Name: probeSetSummary
> ### Title: Summarize Probe Sets Associated with a hyperGTest Result
> ### Aliases: probeSetSummary
> ### Keywords: manip htest
>
> ### ** Examples
>
> ## Fake up some data
> library("hgu95av2.db")
Loading required package: org.Hs.eg.db
> library("annotate")
Loading required package: XML
Attaching package: 'XML'
The following object is masked from 'package:graph':
addNode
> prbs <- ls(hgu95av2GO)[1:300]
> ## Only those with GO ids
> hasGO <- sapply(mget(prbs, hgu95av2GO), function(ids)
+ if(!is.na(ids) && length(ids) > 1) TRUE else FALSE)
> prbs <- prbs[hasGO]
> prbs <- getEG(prbs, "hgu95av2")
> ## remove duplicates, but keep named vector
> prbs <- prbs[!duplicated(prbs)]
> ## do the same for universe
> univ <- ls(hgu95av2GO)[1:5000]
> hasUnivGO <- sapply(mget(univ, hgu95av2GO), function(ids)
+ if(!is.na(ids) && length(ids) > 1) TRUE else FALSE)
> univ <- univ[hasUnivGO]
> univ <- unique(getEG(univ, "hgu95av2"))
>
> p <- new("GOHyperGParams", geneIds=prbs, universeGeneIds=univ,
+ ontology="BP", annotation="hgu95av2", conditional=TRUE)
> ## this part takes time...
> # if(interactive()){
> hyp <- hyperGTest(p)
> ps <- probeSetSummary(hyp, 0.05, 10)
> # }