Last data update: 2014.03.03

geneid.map

R Documentation

Function to find the common genes between two datasets or a dataset and a gene list

Description

This function performs fast mapping between two datasets, or between a dataset and a gene list. Mapping uses Entrez Gene ids as the reference. When several probes represent the same gene, the most variant probe is selected.
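
The rule for ambiguous probes can be illustrated with a minimal base-R sketch. This is not the genefu implementation: the matrix `expr`, the ids in `gid`, the choice of sample variance as the measure of "most variant", and the dropping of probes without an id are all assumptions made for illustration.

```r
# Toy expression matrix: 5 samples (rows) x 4 probes (columns); values are made up
expr <- cbind(
  probeA = c(0.1, 0.2, 0.1, 0.2, 0.1),  # low variance
  probeB = c(-2, 2, -1, 1, 0),          # high variance
  probeC = c(1, 1.5, 0.5, 1, 1.2),
  probeD = c(0, 3, -3, 1, -1)
)
rownames(expr) <- paste0("sample", 1:5)

# Hypothetical Entrez Gene ids: probeA and probeB share a gene, probeD has none
gid <- c(probeA = "1001", probeB = "1001", probeC = "2002", probeD = NA)

# Assumption of this sketch: probes without a gene id are dropped
keep <- !is.na(gid)
gid <- gid[keep]
expr <- expr[, keep, drop = FALSE]

# Variance of each probe across samples
probe.var <- apply(expr, 2, var)

# For each gene id, keep the name of the probe with the largest variance
best.probe <- vapply(
  split(names(gid), gid),
  function(p) p[which.max(probe.var[p])],
  character(1)
)

# Reduced dataset: one column per gene
expr.unique <- expr[, best.probe, drop = FALSE]
```

Here `best.probe` is a character vector named by gene id, so it can be used both to subset the data and to look up which probe represents a given gene.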

Usage

geneid.map(geneid1, data1, geneid2, data2, verbose = FALSE)

Arguments

`geneid1`	First vector of Entrez Gene ids. The names of the vector elements must be the probe names in the dataset `data1`.
`data1`	First dataset, with samples in rows and probes in columns. The dimnames must be properly defined.
`geneid2`	Second vector of Entrez Gene ids. If `data2` is supplied, the names of the vector elements must be the probe names in `data2`; otherwise, proper names must still be assigned.
`data2`	Second dataset, with samples in rows and probes in columns. The dimnames must be properly defined. It may be missing.
`verbose`	`TRUE` to print informative messages, `FALSE` otherwise.

Value

`geneid1`	Mapped gene list from `geneid1`.
`data1`	Mapped dataset from `data1`.
`geneid2`	Mapped gene list from `geneid2`.
`data2`	Mapped dataset from `data2`.
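
The example output on this page suggests that the returned gene lists hold the same ids in the same order, and that the names of each gene list match the columns of the corresponding dataset. The sketch below checks that layout. `check.mapping` is a hypothetical helper, not part of genefu, and `mock` is a hand-built stand-in for a real result: its ids and names are copied from the example output, while two of the numeric values are invented.

```r
# Hypothetical helper (not part of genefu): TRUE if the list has the layout
# suggested by the example output on this page
check.mapping <- function(res) {
  stopifnot(all(c("geneid1", "data1", "geneid2", "data2") %in% names(res)))
  # Both gene lists should carry the same ids in the same order
  ok <- identical(unname(res$geneid1), unname(res$geneid2))
  # Probe names of each gene list should match the columns of its dataset
  if (!is.null(res$data1)) {
    ok <- ok && identical(names(res$geneid1), colnames(res$data1))
  }
  if (!is.null(res$data2)) {
    ok <- ok && identical(names(res$geneid2), colnames(res$data2))
  }
  ok
}

# Mock result mimicking the structure printed by str(res); data2 is NULL
# because no second dataset was supplied
mock <- list(
  geneid1 = c(NM_005804 = "10212", NM_002466 = "4605"),
  data1 = matrix(
    c(-0.078, 0.321, 0.5, -0.2),  # last two values are invented
    nrow = 2,
    dimnames = list(c("NKI_123", "NKI_327"), c("NM_005804", "NM_002466"))
  ),
  geneid2 = c("201584_s_at" = "10212", "201710_at" = "4605"),
  data2 = NULL
)

check.mapping(mock)
```

With a real result, the same call would be `check.mapping(res)`.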

Note

The names of `geneid1` and `geneid2` must be the probe names of the microarray platform.
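
A small sketch of how such a named vector might be built and checked before calling the function. The annotation table `annot`, the matrix `dat`, and all probe names are invented for illustration. The final check is one reasonable way to catch naming mistakes early, not a test that genefu is known to perform.

```r
# Toy annotation table; probe names are invented for illustration
annot <- data.frame(
  probe = c("p1", "p2", "p3"),
  EntrezGene.ID = c("10212", "4605", "332"),
  stringsAsFactors = FALSE
)
rownames(annot) <- annot$probe

# Toy dataset: samples in rows, probes in columns, dimnames defined
dat <- matrix(
  c(0.1, -0.3, 0.7, 0.2, -0.5, 0.4),
  nrow = 2,
  dimnames = list(c("s1", "s2"), c("p1", "p2", "p3"))
)

# Build the gene id vector and name it by probe
gid <- annot[, "EntrezGene.ID"]
names(gid) <- rownames(annot)

# Fail early if the vector is unnamed or refers to probes absent from the data
stopifnot(!is.null(names(gid)), all(names(gid) %in% colnames(dat)))
```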

Author(s)

Benjamin Haibe-Kains

Examples

## load the package that provides geneid.map and the example data
library(genefu)
## load NKI data
data(nkis)
nkis.gid <- annot.nkis[ ,"EntrezGene.ID"]
names(nkis.gid) <- dimnames(annot.nkis)[[1]]
## load GGI signature
data(sig.ggi)
ggi.gid <- sig.ggi[ ,"EntrezGene.ID"]
names(ggi.gid) <- as.character(sig.ggi[ ,"probe"])
## mapping through Entrez Gene ids of NKI and GGI signature
## (data2 is omitted, so a dataset is mapped against a gene list)
res <- geneid.map(geneid1=nkis.gid, data1=data.nkis,
  geneid2=ggi.gid, verbose=FALSE)
str(res)

Results


> library(genefu)
> ### Name: geneid.map
> ### Title: Function to find the common genes between two datasets or a
> ###   dataset and a gene list
> ### Aliases: geneid.map
> ### Keywords: mapping
> 
> ### ** Examples
> 
> ## load NKI data
> data(nkis)
> nkis.gid <- annot.nkis[ ,"EntrezGene.ID"]
> names(nkis.gid) <- dimnames(annot.nkis)[[1]]
> ## load GGI signature
> data(sig.ggi)
> ggi.gid <- sig.ggi[ ,"EntrezGene.ID"]
> names(ggi.gid) <- as.character(sig.ggi[ ,"probe"])
> ## mapping through Entrez Gene ids of NKI and GGI signature
> res <- geneid.map(geneid1=nkis.gid, data1=data.nkis,
+   geneid2=ggi.gid, verbose=FALSE)
> str(res)
List of 4
 $ geneid1: Named chr [1:54] "10212" "4605" "332" "4171" ...
  ..- attr(*, "names")= chr [1:54] "NM_005804" "NM_002466" "NM_001168" "NM_004526" ...
 $ data1  : num [1:150, 1:54] -0.078 0.321 -0.068 -0.282 -0.178 -0.157 -0.017 0.263 -0.07 -0.156 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:150] "NKI_123" "NKI_327" "NKI_291" "NKI_370" ...
  .. ..$ : chr [1:54] "NM_005804" "NM_002466" "NM_001168" "NM_004526" ...
 $ geneid2: Named chr [1:54] "10212" "4605" "332" "4171" ...
  ..- attr(*, "names")= chr [1:54] "201584_s_at" "201710_at" "202094_at" "202107_s_at" ...
 $ data2  : NULL