simLL
R Documentation
Functions to compute similarities between GO graphs and also
between Entrez Gene IDs based on their induced GO graphs.
Description
Both simUI and simLP compute a similarity measure
between two GO graphs. For simLL, first the induced GO graph
for each of its arguments is found and then these are passed to one
of simUI or simLP.
Arguments
dropCodes
A set of evidence codes to be ignored when
constructing the induced GO graphs.
mapfun
A function taking a character vector of Entrez Gene IDs
as its only argument and returning a list of "GO lists" matching the
structure of the lists in the GO maps of annotation data packages.
The function should behave similarly to mget(x, eg2gomap,
ifnotfound=NA), that is, NA should be returned if a specified
Entrez ID has no GO mapping. See Details for the interaction of
mapfun and chip.
chip
The name of a DB-based annotation data package (the name
will end in ".db"). This package will be used to generate an Entrez
ID to GO ID mapping instead of mapfun.
g1
An instance of the graph class.
g2
An instance of the graph class.
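To make the mapfun contract described above concrete, here is a minimal base-R sketch. The names toy_eg2go and toy_mapfun, the Entrez IDs "1001" and "1002", and the annotations attached to them are all invented for illustration; only the shape of the return value (a list of "GO lists" with GOID, Evidence and Ontology fields, and NA for unmapped IDs) follows the description of mapfun.

```r
# Toy Entrez-to-GO environment. IDs and annotations are made up
# for illustration and do not come from any annotation package.
toy_eg2go <- new.env()
assign("1001", list(
  "GO:0000001" = list(GOID = "GO:0000001", Evidence = "IDA", Ontology = "BP"),
  "GO:0000002" = list(GOID = "GO:0000002", Evidence = "IEA", Ontology = "BP")
), envir = toy_eg2go)
assign("1002", list(
  "GO:0000001" = list(GOID = "GO:0000001", Evidence = "TAS", Ontology = "BP")
), envir = toy_eg2go)

# Hypothetical mapfun: takes a character vector of Entrez Gene IDs as
# its only argument and returns NA for any ID without a GO mapping.
toy_mapfun <- function(x) mget(x, envir = toy_eg2go, ifnotfound = NA)

res <- toy_mapfun(c("1001", "9999"))
str(res, max.level = 1)
```

A function like this would be supplied as the mapfun argument in place of chip.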
Details
For each of ll1 and ll2, the set of most specific GO
terms within the specified ontology (Ontology) that are not
based on any excluded evidence code (dropCodes) is found. The
mapping is achieved in one of three ways:
If mapfun is provided, it will be used to perform the
needed lookups. In this case, chip will be ignored.
If chip is provided and mapfun=NULL, then the
needed lookups will be done based on the Entrez to GO mappings
encapsulated in the specified annotation data package. This is
the recommended usage.
If mapfun and chip are NULL or missing,
then the function will attempt to load the GO package (the
environment-based package, distinct from GO.db). This package
contains a legacy environment mapping Entrez IDs to GO IDs. If
the GO package is not available, an error will be raised.
Omitting both mapfun and chip is not recommended as
it is not compatible with the DB-based annotation data packages.
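The filtering part of this first step (restricting to one ontology and ignoring excluded evidence codes) can be sketched in base R as follows. This is an illustration under stated assumptions, not the GOstats implementation: keep_go_ids and the toy go_list are hypothetical, and the sketch does not attempt the further reduction to the most specific terms.

```r
# Hypothetical helper: given one "GO list" (as returned per Entrez ID by
# a lookup), keep the GO IDs in the requested ontology whose evidence
# code is not in drop_codes. An unmapped ID (NA) yields character(0).
keep_go_ids <- function(go_list, ontology, drop_codes = NULL) {
  if (is.atomic(go_list) && length(go_list) == 1L && is.na(go_list)) {
    return(character(0))
  }
  keep <- vapply(
    go_list,
    function(g) g$Ontology == ontology && !(g$Evidence %in% drop_codes),
    logical(1)
  )
  unname(vapply(go_list[keep], function(g) g$GOID, character(1)))
}

# Toy annotations, invented for illustration.
go_list <- list(
  list(GOID = "GO:t1", Evidence = "IDA", Ontology = "BP"),
  list(GOID = "GO:t2", Evidence = "IEA", Ontology = "BP"),
  list(GOID = "GO:t3", Evidence = "IDA", Ontology = "MF")
)

kept_all <- keep_go_ids(go_list, "BP")
kept_no_iea <- keep_go_ids(go_list, "BP", drop_codes = "IEA")
```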
Next, the induced GO graphs are computed.
Finally, these graphs are passed to either simUI (union
intersection) or simLP (longest path). For simUI the
similarity is the size of the intersection of the node sets divided by
the size of the union of the node sets. Larger values indicate greater
similarity. These similarities are between 0 and 1.
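The union-intersection ratio can be illustrated directly on node sets. The sketch below uses plain character vectors rather than graph objects, and the function name sim_ui_sketch and the node labels are hypothetical.

```r
# Size of the node-set intersection divided by the size of the union.
sim_ui_sketch <- function(nodes1, nodes2) {
  length(intersect(nodes1, nodes2)) / length(union(nodes1, nodes2))
}

# Toy node sets standing in for the nodes of two induced GO graphs.
n1 <- c("t1", "t2", "t3", "root")
n2 <- c("t2", "t3", "t4", "root")

ui <- sim_ui_sketch(n1, n2)  # 3 shared nodes out of 5 distinct nodes
```

Here three nodes are shared out of five distinct nodes, so the value is 0.6; identical node sets give 1 and disjoint node sets give 0.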
For simLP the similarity is the length of the longest path in the
intersection graph of the two supplied graphs. Again, larger values
indicate greater similarity. Similarities are between 0 and the maximum
leaf depth of the graph for the specified ontology.
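The longest-path idea can be sketched in base R on toy edge lists. This is not the simLP code: the function names are hypothetical, the graphs are represented as data frames of directed child-to-parent edges instead of graph objects, the intersection graph is taken to be the set of shared edges, and path length is counted in edges. All of these are assumptions made for the illustration.

```r
# Encode each directed edge as a single string so edge sets can be compared.
edge_keys <- function(e) paste(e$from, e$to, sep = "|")

# Longest path (in edges) through the edges shared by two acyclic graphs.
sim_lp_sketch <- function(e1, e2) {
  common <- e1[edge_keys(e1) %in% edge_keys(e2), , drop = FALSE]
  if (nrow(common) == 0L) return(0)
  memo <- new.env()  # cache of longest path length starting at each node
  longest_from <- function(node) {
    if (exists(node, envir = memo, inherits = FALSE)) {
      return(get(node, envir = memo))
    }
    nxt <- common$to[common$from == node]
    best <- if (length(nxt) == 0L) 0 else
      1 + max(vapply(nxt, longest_from, numeric(1)))
    assign(node, best, envir = memo)
    best
  }
  max(vapply(unique(common$from), longest_from, numeric(1)))
}

# Toy edge lists; node labels are invented.
e1 <- data.frame(from = c("a", "b", "c", "d"),
                 to   = c("b", "c", "root", "root"),
                 stringsAsFactors = FALSE)
e2 <- data.frame(from = c("b", "c", "x"),
                 to   = c("c", "root", "c"),
                 stringsAsFactors = FALSE)

lp <- sim_lp_sketch(e1, e2)  # shared path b -> c -> root has 2 edges
```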
Value
A list with:
sim
The numeric similarity measure.
measure
Which measure was used.
g1
The graph induced by ll1.
g2
The graph induced by ll2.
If one of the supplied Gene IDs does not have any GO terms associated
with it in the selected ontology and with the selected evidence codes,
then NA is returned.
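Because the result is either a list or NA, calling code may want to check which one it received before extracting sim. A minimal sketch follows; get_sim is a hypothetical helper, and mock_res is a hand-built stand-in for a real result (its g1 and g2 are placeholders, not graph objects).

```r
# Hypothetical helper: return the similarity, or NA if no list came back.
get_sim <- function(res) {
  if (is.list(res)) res$sim else NA_real_
}

# Hand-built stand-in with the documented components; values are invented.
mock_res <- list(sim = 0.6, measure = "UI", g1 = NULL, g2 = NULL)

s_ok <- get_sim(mock_res)
s_na <- get_sim(NA)
```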
> library(GOstats)
> ### Name: simLL
> ### Title: Functions to compute similarities between GO graphs and also
> ### between Entrez Gene IDs based on their induced GO graphs.
> ### Aliases: simLL simUI simLP
> ### Keywords: manip
>
> ### ** Examples
>
> library("hgu95av2.db")
Loading required package: org.Hs.eg.db
> eg1 = c("9184", "3547")
>
> bb = simLL(eg1[1], eg1[2], "BP", chip="hgu95av2.db")
>