geneid.map
R Documentation
Function to find the common genes between two datasets or a dataset and a gene list
Description
This function performs fast mapping between two datasets, or between a dataset and a gene list, using Entrez Gene ids as the reference. When several probes represent the same gene, the probe with the highest variance is selected.
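As a rough illustration of the rule for ambiguous probes, the following base R sketch keeps, for each Entrez Gene id, the probe with the largest variance across samples. It is not the genefu implementation; the toy matrix, the probe names p1 to p4, and the object names probe.var, best.probe, mapped.geneid and mapped.data are assumptions made for this example.

```r
# Toy dataset (made-up values): samples in rows, probes in columns.
data1 <- cbind(p1 = c(0.1, 0.2, 0.1, 0.3, 0.2),
               p2 = c(-2.0, 1.5, 3.0, -1.0, 0.5),
               p3 = c(0.4, -0.6, 0.2, 0.9, -0.3),
               p4 = c(1.1, 0.8, 1.3, 0.7, 1.0))
rownames(data1) <- paste0("s", 1:5)

# Entrez Gene ids named by probe; p1 and p2 both represent gene "100".
geneid1 <- c(p1 = "100", p2 = "100", p3 = "200", p4 = "300")

# Variance of each probe across samples.
probe.var <- apply(data1, 2, var, na.rm = TRUE)

# For each gene id, keep the name of its most variant probe.
best.probe <- vapply(split(names(geneid1), geneid1),
                     function(p) p[which.max(probe.var[p])],
                     character(1))

# Restrict the gene list and the dataset to the selected probes.
mapped.geneid <- geneid1[best.probe]
mapped.data <- data1[, best.probe, drop = FALSE]
```

In this toy case p2 varies far more than p1, so gene "100" is represented by p2 and the mapped dataset keeps the columns p2, p3 and p4.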
Arguments
geneid1
First vector of Entrez Gene ids. The names of the vector elements must be the probe names of the dataset data1.
data1
First dataset with samples in rows and probes in columns. The dimnames must be properly defined.
geneid2
Second vector of Entrez Gene ids. If data2 is not missing, the names of the vector elements must be the probe names of the dataset data2; otherwise, proper names must still be assigned.
data2
Second dataset with samples in rows and probes in columns. The dimnames must be properly defined. It may be missing.
verbose
TRUE to print informative messages, FALSE otherwise.
Value
geneid1
Mapped gene list from geneid1.
data1
Mapped dataset from data1.
geneid2
Mapped gene list from geneid2.
data2
Mapped dataset from data2.
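To make the shape of the returned gene lists more concrete, here is a minimal base R sketch of restricting two named gene-id vectors to their common Entrez Gene ids. It assumes that the mapped vectors are aligned gene by gene while keeping their own probe names; the vectors and the names common, mapped1 and mapped2 are invented for illustration, and the code does not call genefu.

```r
# Two toy gene lists, each named by its own (hypothetical) probe identifiers.
geneid1 <- c(a1 = "10", a2 = "20", a3 = "30")
geneid2 <- c(b1 = "30", b2 = "10", b3 = "40")

# Entrez Gene ids present in both lists.
common <- intersect(unname(geneid1), unname(geneid2))

# Align both lists on the common ids; probe names are carried along.
mapped1 <- geneid1[match(common, geneid1)]
mapped2 <- geneid2[match(common, geneid2)]
```

After this step mapped1 and mapped2 hold the same gene ids in the same order, while their names still refer to the probes of each source.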
Note
The names of geneid1 and geneid2 must be the probe names of the microarray platform.
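A quick check before calling the function can catch unnamed or mismatched gene lists. The helper below, check.probe.names, is hypothetical and not part of genefu; it assumes that every name of the gene-id vector should appear among the column names of the dataset.

```r
# Hypothetical helper: TRUE when geneid is named and every name is a
# probe (column) name of data.
check.probe.names <- function(geneid, data) {
  probes <- names(geneid)
  !is.null(probes) && !is.null(colnames(data)) &&
    !anyNA(probes) && all(probes %in% colnames(data))
}

# Toy dataset: two samples in rows, two probes in columns.
data1 <- matrix(0, nrow = 2, ncol = 2,
                dimnames = list(c("s1", "s2"), c("p1", "p2")))

good <- c(p1 = "100", p2 = "200")  # named by probe
bad <- c("100", "200")             # no names assigned

check.probe.names(good, data1)  # TRUE
check.probe.names(bad, data1)   # FALSE
```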
Author(s)
Benjamin Haibe-Kains
Examples
## load NKI data
data(nkis)
nkis.gid <- annot.nkis[ ,"EntrezGene.ID"]
names(nkis.gid) <- dimnames(annot.nkis)[[1]]
## load GGI signature
data(sig.ggi)
ggi.gid <- sig.ggi[ ,"EntrezGene.ID"]
names(ggi.gid) <- as.character(sig.ggi[ ,"probe"])
## mapping through Entrez Gene ids of NKI and GGI signature
res <- geneid.map(geneid1=nkis.gid, data1=data.nkis,
                  geneid2=ggi.gid, verbose=FALSE)
str(res)
Results
> library(genefu)
Loading required package: survcomp
Loading required package: survival
Loading required package: prodlim
Loading required package: mclust
Package 'mclust' version 5.2
Type 'citation("mclust")' for citing this R package in publications.
Loading required package: limma
Loading required package: biomaRt
Loading required package: iC10
Loading required package: pamr
Loading required package: cluster
Loading required package: iC10TrainingData
Loading required package: AIMS
Loading required package: e1071
Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':
    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB
The following object is masked from 'package:limma':
    plotMA
The following objects are masked from 'package:stats':
    IQR, mad, xtabs
The following objects are masked from 'package:base':
    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit
Welcome to Bioconductor
Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/genefu/geneid.map.Rd_%03d_medium.png", width=480, height=480)
> ### Name: geneid.map
> ### Title: Function to find the common genes between two datasets or a
> ### dataset and a gene list
> ### Aliases: geneid.map
> ### Keywords: mapping
>
> ### ** Examples
>
> ## load NKI data
> data(nkis)
> nkis.gid <- annot.nkis[ ,"EntrezGene.ID"]
> names(nkis.gid) <- dimnames(annot.nkis)[[1]]
> ## load GGI signature
> data(sig.ggi)
> ggi.gid <- sig.ggi[ ,"EntrezGene.ID"]
> names(ggi.gid) <- as.character(sig.ggi[ ,"probe"])
> ## mapping through Entrez Gene ids of NKI and GGI signature
> res <- geneid.map(geneid1=nkis.gid, data1=data.nkis,
+ geneid2=ggi.gid, verbose=FALSE)
> str(res)
List of 4
$ geneid1: Named chr [1:54] "10212" "4605" "332" "4171" ...
..- attr(*, "names")= chr [1:54] "NM_005804" "NM_002466" "NM_001168" "NM_004526" ...
$ data1 : num [1:150, 1:54] -0.078 0.321 -0.068 -0.282 -0.178 -0.157 -0.017 0.263 -0.07 -0.156 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:150] "NKI_123" "NKI_327" "NKI_291" "NKI_370" ...
.. ..$ : chr [1:54] "NM_005804" "NM_002466" "NM_001168" "NM_004526" ...
$ geneid2: Named chr [1:54] "10212" "4605" "332" "4171" ...
..- attr(*, "names")= chr [1:54] "201584_s_at" "201710_at" "202094_at" "202107_s_at" ...
$ data2 : NULL
> dev.off()
null device
1
>