Last data update: 2014.03.03

R: Filter GO and KEGG database
GO2list    R Documentation

Filter GO and KEGG database

Description

Filter a GO or KEGG database and transform it into a named list.

Usage

GO2list(dbase, go.cat = NULL, rm = NULL, keep = NULL)
KEGG2list(dbase, rm = NULL, keep = NULL)
GO2offspring(x)
GO2level(x, go.level=-1, relation=c("is_a"))

Arguments

dbase

A data structure storing the identifiers of GO/KEGG terms and the genes assigned to them. Can be one of:

database

usually of class ‘ProbeGo3AnnDbBimap’ (as defined in package “AnnotationDbi”)

named list

with keys being the identifiers and values being genes

data frame

with first column being the identifiers and second column being genes. Additional columns are ignored.

x

a list with keys being the identifiers and values being genes (e.g. output of GO2list)

go.cat

GO category ("MF", "BP", "CC") that should be returned and filtered

go.level

Level in the DAG of GO terms. Defaults to “-1” for pass through without modification. Otherwise: a positive integer giving the level at which GO terms should be grouped together.

rm

remove these terms

keep

keep only these terms

relation

Relationships in the GO hierarchy that should be considered. Defaults to “is_a”.

Details

The settings for “rm” and “keep” can be combined, allowing for efficient reduction of the number of GO terms and KEGG pathways, respectively.
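The combined effect of “rm” and “keep” can be pictured with a small base-R sketch. The helper `filter_terms` and the toy gene ids are hypothetical and are not part of geecc; the sketch only assumes that “keep” acts as a whitelist and “rm” as a blacklist on the term identifiers, which may differ in detail from what GO2list and KEGG2list do internally.

```r
# Toy annotation: names are term identifiers, values are gene/probe ids.
terms <- list(
  "GO:0000139" = c("g1", "g2"),
  "GO:0005730" = c("g3"),
  "GO:0005739" = c("g2", "g4"),
  "GO:0005829" = c("g5")
)

# Hypothetical helper (not a geecc function): apply 'keep' as a whitelist,
# then 'rm' as a blacklist. NULL means "no restriction".
filter_terms <- function(x, rm = NULL, keep = NULL) {
  ids <- names(x)
  if (!is.null(keep)) ids <- intersect(ids, keep)
  if (!is.null(rm))   ids <- setdiff(ids, rm)
  x[ids]
}

filtered <- filter_terms(
  terms,
  keep = c("GO:0000139", "GO:0005730", "GO:0005739"),
  rm   = "GO:0005730"
)
names(filtered)
```

Under this assumption the order of the two steps does not matter: a term survives only if it is listed in “keep” and not listed in “rm”.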

Providing a named list instead of a database can be useful for non-model organisms, where only a draft Blast2GO-annotation is available. In this case, the names of the list are the GO terms (or KEGG pathways) and the content of each list item is a character vector with tag-ids.
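As an illustration of the named-list layout described above, the following base-R sketch converts a two-column table into such a list. The data frame `anno`, its column names, and the tag ids are made up for the example; a real Blast2GO export would have to be read into a comparable two-column table first.

```r
# Hypothetical draft annotation: first column term id, second column tag id.
anno <- data.frame(
  term = c("GO:0005739", "GO:0005739", "GO:0005730", "GO:0005739"),
  tag  = c("tag_001", "tag_002", "tag_002", "tag_001"),
  stringsAsFactors = FALSE
)

# One list element per term; duplicated tag ids within a term are dropped.
term2tags <- lapply(split(anno$tag, anno$term), unique)
str(term2tags)
```

A list of this shape (term identifiers as names, character vectors of tag ids as values) matches the structure the “dbase” argument is documented to accept.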

The function GO2offspring produces the same kind of mapping as the GO2ALLPROBES objects of annotation packages (e.g. hgu133plus2GO2ALLPROBES). That is, instead of containing only the features (probe sets, genes, ...) assigned directly to a GO term, each entry also contains all features assigned to any of the term's descendants (offspring).
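The idea of pooling features from descendant terms can be sketched in base R on a toy hierarchy. The term names, probe ids, and the `offspring` map below are invented for illustration; for real GO terms such a map could be taken from GO.db (e.g. GOCCOFFSPRING), although this page does not state how GO2offspring obtains it.

```r
# Direct annotations (toy ids, not real GO terms).
direct <- list(
  root   = c("p1"),
  child  = c("p2"),
  grandc = c("p3", "p1")
)

# Hypothetical map: each term -> all of its descendants.
offspring <- list(
  root   = c("child", "grandc"),
  child  = c("grandc"),
  grandc = character(0)
)

# For each term, pool its own features with those of every descendant.
with_offspring <- lapply(
  setNames(names(direct), names(direct)),
  function(term) {
    unique(unlist(direct[c(term, offspring[[term]])], use.names = FALSE))
  }
)
str(with_offspring)
```

In this toy case the root term ends up with every probe id, while the leaf term keeps only its direct annotations.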

The function GO2level groups GO terms together at a more general level to simplify data interpretation and speed up runtime. This function works according to the level option provided by DAVID, but the number of levels is not restricted.
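Grouping at a fixed level can be illustrated with a deliberately simplified base-R sketch. It assumes a tree in which every term has a single “is_a” parent and the root sits at level 1; the real GO graph is a DAG with multiple parents, and the helpers `path_from_root` and `to_level` are hypothetical, so this shows the concept rather than the algorithm used by GO2level.

```r
# Toy "is_a" hierarchy: each term points to its single parent (NA = root).
parent <- c(root = NA, A = "root", B = "root", A1 = "A", A2 = "A", B1 = "B")

# Toy annotation keyed by term.
genes <- list(A = "g1", A1 = c("g2", "g3"), A2 = "g3", B1 = "g4")

# Walk up from a term and return the chain root -> ... -> term.
path_from_root <- function(term) {
  path <- term
  while (!is.na(parent[[term]])) {
    term <- parent[[term]]
    path <- c(term, path)
  }
  path
}

# Hypothetical helper: map a term to its ancestor at 'level' (root = level 1);
# terms shallower than 'level' are kept as they are.
to_level <- function(term, level) {
  path <- path_from_root(term)
  path[min(level, length(path))]
}

level   <- 2
target  <- vapply(names(genes), to_level, character(1), level = level)
# Merge the gene sets of all terms that map to the same level-2 ancestor.
grouped <- lapply(
  split(names(genes), target),
  function(ids) unique(unlist(genes[ids], use.names = FALSE))
)
str(grouped)
```

Here the three terms under A collapse into one entry, which reduces the number of categories that have to be tested downstream.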

Value

A named list with each slot containing the ids for the term or pathway.

Examples

library(hgu133plus2.db)
x <- GO2list(dbase=hgu133plus2GO2PROBE, go.cat="CC",
	rm=c("GO:0000139", "GO:0000790", "GO:0005730", "GO:0005739"))

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(geecc)
geecc 1.6.0 loaded
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/geecc/GO2list.Rd_%03d_medium.png", width=480, height=480)
> ### Name: GO2list
> ### Title: Filter GO and KEGG database
> ### Aliases: GO2list KEGG2list GO2level GO2offspring
> 
> ### ** Examples
> 
> library(hgu133plus2.db)
Loading required package: AnnotationDbi
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: IRanges
Loading required package: S4Vectors

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: org.Hs.eg.db


> x <- GO2list(dbase=hgu133plus2GO2PROBE, go.cat="CC",
+ 	rm=c("GO:0000139", "GO:0000790", "GO:0005730", "GO:0005739"))
Loading required package: GO.db

> 
> dev.off()
null device 
          1 
>