Last data update: 2014.03.03

R: Custom Gene Set Collection Index
buildCustomIdxEZID    R Documentation

Custom Gene Set Collection Index

Description

Creates a gene set collection from a user-supplied list of gene sets, for use in an EGSEA analysis.

Usage

buildCustomIdxEZID(entrezIDs, gsets, anno = NULL, label = "custom",
  name = "Custom", species = "Human", min.size = 1)

Arguments

entrezIDs

character, a vector of the Entrez Gene IDs present in your dataset. The order of the Entrez Gene IDs should match the order of the row names of the count/expression matrix.

gsets

list, a list of gene sets. Each gene set is a character vector of Entrez Gene IDs. The names of the list should match the GeneSet column of the anno argument (if it is provided).

anno

list, a data frame that stores a detailed annotation for each gene set. Its fields can include ID, GeneSet, PubMed, URLs, etc. The GeneSet field is mandatory, and its values should be the same as the names of gsets.

label

character, a unique id that identifies the collection of gene sets.

name

character, the collection name to be used in the EGSEA report.

species

character, the organism of the selected gene sets: "human", "mouse" or "rat".

min.size

integer, the minimum number of genes required in a gene set for it to be tested.
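
The requirements on gsets and anno can be checked before calling the function. The following base R sketch builds a toy named list of gene sets and a matching annotation data frame, then verifies that the names line up. All Entrez Gene IDs, set names and ID values below are made-up placeholders, not values from a real dataset.

```r
# Toy inputs for illustration only; all IDs and names below are hypothetical.
entrezIDs <- c("1001", "1002", "1003", "1004", "1005")

# gsets: a named list, each element a character vector of Entrez Gene IDs.
gsets <- list(
  setA = c("1001", "1002", "1003"),
  setB = c("1003", "1004"),
  setC = c("1005", "9999")   # "9999" is not present in entrezIDs
)

# anno: one row per gene set; the GeneSet column repeats names(gsets).
anno <- data.frame(
  ID      = paste0("custom", seq_along(gsets)),
  GeneSet = names(gsets),
  stringsAsFactors = FALSE
)

# Sanity checks that mirror the documented requirements.
stopifnot(
  is.character(entrezIDs),
  all(vapply(gsets, is.character, logical(1))),
  !is.null(names(gsets)),
  setequal(names(gsets), anno$GeneSet)
)

# Count how many genes of each set are actually present in the dataset.
genes.found <- vapply(gsets, function(g) sum(g %in% entrezIDs), integer(1))
print(genes.found)
```

Counting the matched genes per set gives a quick view of which sets might fall below min.size once genes absent from the dataset are ignored.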

Details

It indexes the newly created gene sets and attaches the gene set annotation if one is provided.
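
To make "indexing" concrete, the sketch below shows one plausible reading: each gene set is translated from Entrez Gene IDs to row positions in entrezIDs, and sets with fewer than min.size matched genes are dropped. This is a base R illustration with made-up IDs under that assumption. The helper index_gene_sets is hypothetical and not part of the package, whose internal implementation may differ.

```r
# Hypothetical helper: map gene sets to row indices of the expression matrix.
index_gene_sets <- function(entrezIDs, gsets, min.size = 1) {
  idx <- lapply(gsets, function(g) {
    pos <- match(g, entrezIDs)        # row position of each gene, NA if absent
    sort(unique(pos[!is.na(pos)]))    # drop unmatched genes and duplicates
  })
  # Keep only sets that retain at least min.size genes after matching.
  idx[lengths(idx) >= min.size]
}

entrezIDs <- c("1001", "1002", "1003", "1004", "1005")  # made-up IDs
gsets <- list(
  setA = c("1001", "1003"),
  setB = c("1004", "9999"),   # "9999" is absent from entrezIDs
  setC = c("8888")            # no gene of this set is in the dataset
)

idx <- index_gene_sets(entrezIDs, gsets, min.size = 1)
str(idx)
```

Storing positions rather than IDs lets downstream code subset the expression matrix rows directly, which is why the order of entrezIDs has to match the matrix row names.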

Value

An indexed gene set annotation that can be used with other functions in the package. Each annotation is a list of seven elements: original stores the original gene sets; idx stores the indexed gene sets; anno stores the detailed annotation for each gene set; label is a unique id that identifies the collection of gene sets; featureIDs stores the entrezIDs used in building the annotation; species stores the organism name of the gene sets; and name stores the collection name to be used in the EGSEA report.

Examples

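library(EGSEA)
library(EGSEAdata)
# Load the il13.data example dataset and extract its voom element
data(il13.data)
v = il13.data$voom
# Build the KEGG collection index, excluding "Metabolism" pathways
kegg = buildIdxEZID(entrezIDs=rownames(v$E), species="human",
         msigdb.gsets="none",
         kegg.updated=FALSE, kegg.exclude = c("Metabolism"))
# Use the first 50 KEGG gene sets as a custom collection
gsets = kegg$kegg$original[1:50]
gs.annots = buildCustomIdxEZID(entrezIDs=rownames(v$E), gsets=gsets,
         species="human")
# List the elements of the resulting annotation
names(gs.annots)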


Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(EGSEA)
Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: gage
Loading required package: AnnotationDbi
Loading required package: stats4
Loading required package: IRanges
Loading required package: S4Vectors

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: topGO
Loading required package: graph
Loading required package: GO.db

Loading required package: SparseM

Attaching package: 'SparseM'

The following object is masked from 'package:base':

    backsolve


groupGOTerms: 	GOBPTerm, GOMFTerm, GOCCTerm environments built.

Attaching package: 'topGO'

The following object is masked from 'package:IRanges':

    members

The following object is masked from 'package:gage':

    geneData

Loading required package: pathview
Loading required package: org.Hs.eg.db

##############################################################################
Pathview is an open source software package distributed under GNU General
Public License version 3 (GPLv3). Details of GPLv3 is available at
http://www.gnu.org/licenses/gpl-3.0.html. Particullary, users are required to
formally cite the original Pathview paper (not just mention it) in publications
or products. For details, do citation("pathview") within R.

The pathview downloads and uses KEGG data. Non-academic uses may require a KEGG
license agreement (details at http://www.kegg.jp/kegg/legal.html).
##############################################################################

KEGG.db contains mappings based on older data because the original
  resource was removed from the the public domain before the most
  recent update was produced. This package should now be considered
  deprecated and future versions of Bioconductor may not have it
  available.  Users who want more current data are encouraged to look
  at the KEGGREST or reactome.db packages





> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/EGSEA/buildCustomIdxEZID.Rd_%03d_medium.png", width=480, height=480)
> ### Name: buildCustomIdxEZID
> ### Title: Custom Gene Set Collection Index
> ### Aliases: buildCustomIdxEZID
> 
> ### ** Examples
> 
> library(EGSEAdata) 
> data(il13.data)
> v = il13.data$voom
> kegg = buildIdxEZID(entrezIDs=rownames(v$E), species="human", 
+ msigdb.gsets="none", 
+          kegg.updated=FALSE, kegg.exclude = c("Metabolism"))
[1] "Building KEGG pathways annotation object ... "
> gsets = kegg$kegg$original[1:50]
> gs.annots = buildCustomIdxEZID(entrezIDs=rownames(v$E), gsets= gsets, 
+ species="human")
[1] "Building custom pathways annotation object ... "
> names(gs.annots)
[1] "original"   "idx"        "anno"       "label"      "featureIDs"
[6] "species"    "name"      
> 
> dev.off()
null device 
          1 
>