Last data update: 2014.03.03

geneid.map    R Documentation

Function to find the common genes between two datasets or a dataset and a gene list

Description

This function allows for fast mapping between two datasets, or between a dataset and a gene list. The mapping uses Entrez Gene IDs as the reference. In case of ambiguity (several probes representing the same gene), the probe with the highest variance is selected.
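
The disambiguation rule can be illustrated with a minimal base R sketch. It is not the genefu implementation: the toy matrix, the probe names and the helper pick_most_variant are hypothetical and only show what "most variant probe per gene" means.

```r
# Toy expression matrix: 5 samples (rows) x 4 probes (columns).
expr <- cbind(
  p1 = c(1, 2, 3, 4, 5),
  p2 = c(1, 1, 1, 1, 2),
  p3 = c(0, 4, 0, 4, 0),
  p4 = c(2, 2, 3, 3, 2)
)

# Entrez Gene IDs named by probe: p1 and p2 both map to gene "100".
gid <- c(p1 = "100", p2 = "100", p3 = "200", p4 = "300")

# Hypothetical helper: keep, for each gene, the probe with the largest variance.
pick_most_variant <- function(geneid, data) {
  v <- apply(data[, names(geneid), drop = FALSE], 2, var, na.rm = TRUE)
  # Visit probes by decreasing variance and drop later duplicates of a gene.
  ord <- order(v, decreasing = TRUE)
  keep <- ord[!duplicated(geneid[ord])]
  # Restore the original probe order for the probes that were kept.
  geneid[sort(keep)]
}

selected <- pick_most_variant(gid, expr)
print(selected)
```

For gene "100", probe p1 is kept because its variance across samples is larger than that of p2; genes with a single probe are left unchanged.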

Usage

geneid.map(geneid1, data1, geneid2, data2, verbose = FALSE)

Arguments

geneid1

First vector of Entrez Gene IDs. The names of the vector elements must be the probe names in the dataset data1.

data1

First dataset with samples in rows and probes in columns. The dimnames must be properly defined.

geneid2

Second vector of Entrez Gene IDs. The names of the vector elements must be the probe names in the dataset data2 if it is provided; otherwise, proper names must still be assigned.

data2

Second dataset with samples in rows and probes in columns. The dimnames must be properly defined. It may be missing.

verbose

TRUE to print informative messages, FALSE otherwise.

Value

geneid1

Mapped gene list from geneid1.

data1

Mapped dataset from data1.

geneid2

Mapped gene list from geneid2.

data2

Mapped dataset from data2, or NULL if data2 is missing.
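
In the str(res) output shown under Results, geneid1 and geneid2 list the same Entrez Gene IDs in the same order. The base R sketch below shows one way such an alignment could be produced for toy data in which every gene is already represented by a single probe. The objects gid_a, gid_b, dat_a and dat_b are hypothetical, and the code is not taken from genefu.

```r
# Two toy datasets with samples in rows and probes in columns.
dat_a <- matrix(1:6, nrow = 2,
                dimnames = list(c("s1", "s2"), c("a1", "a2", "a3")))
dat_b <- matrix(7:12, nrow = 2,
                dimnames = list(c("t1", "t2"), c("b1", "b2", "b3")))

# Entrez Gene IDs named by the probes of each platform.
gid_a <- c(a1 = "10", a2 = "20", a3 = "30")
gid_b <- c(b1 = "30", b2 = "40", b3 = "10")

# Genes present on both platforms, in the order of the first one.
common <- intersect(gid_a, gid_b)

# Subset and reorder each side so that position i refers to the same gene.
map_a <- gid_a[match(common, gid_a)]
map_b <- gid_b[match(common, gid_b)]

aligned <- list(
  geneid1 = map_a, data1 = dat_a[, names(map_a), drop = FALSE],
  geneid2 = map_b, data2 = dat_b[, names(map_b), drop = FALSE]
)
str(aligned)
```

Matching on the gene IDs rather than on probe names is what lets two platforms with different probe identifiers be compared column by column.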

Note

The names of geneid1 and geneid2 must be the probe names of the microarray platform.
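
A quick check along the following lines can catch naming problems before the call. The helper check_probe_names is hypothetical and not part of genefu; it only verifies that the gene ID vector is fully named and that its names appear among the column names of the dataset.

```r
# Hypothetical helper: stop early if the gene ID vector is not named by probes
# that exist as columns of the dataset.
check_probe_names <- function(geneid, data) {
  nn <- names(geneid)
  if (is.null(nn) || any(is.na(nn)) || any(nn == "")) {
    stop("geneid must be a fully named vector")
  }
  missing_probes <- setdiff(nn, colnames(data))
  if (length(missing_probes) > 0) {
    stop("probes not found in data: ", paste(missing_probes, collapse = ", "))
  }
  invisible(TRUE)
}

# Toy dataset with samples in rows and probes in columns.
dat <- matrix(rnorm(6), nrow = 2,
              dimnames = list(c("s1", "s2"), c("p1", "p2", "p3")))
gid <- c(p1 = "100", p2 = "200", p3 = "300")
check_probe_names(gid, dat)  # passes silently

# A vector naming a probe that is absent from the dataset is rejected.
bad <- c(p1 = "100", px = "200")
msg <- tryCatch(check_probe_names(bad, dat),
                error = function(e) conditionMessage(e))
print(msg)
```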

Author(s)

Benjamin Haibe-Kains

Examples

## load the genefu package and the NKI data
library(genefu)
data(nkis)
nkis.gid <- annot.nkis[ ,"EntrezGene.ID"]
names(nkis.gid) <- dimnames(annot.nkis)[[1]]
## load GGI signature
data(sig.ggi)
ggi.gid <- sig.ggi[ ,"EntrezGene.ID"]
names(ggi.gid) <- as.character(sig.ggi[ ,"probe"])
## mapping through Entrez Gene ids of NKI and GGI signature
res <- geneid.map(geneid1=nkis.gid, data1=data.nkis,
  geneid2=ggi.gid, verbose=FALSE)
str(res)

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Platform: x86_64-pc-linux-gnu (64-bit)

> library(genefu)
> ## load NKI data
> data(nkis)
> nkis.gid <- annot.nkis[ ,"EntrezGene.ID"]
> names(nkis.gid) <- dimnames(annot.nkis)[[1]]
> ## load GGI signature
> data(sig.ggi)
> ggi.gid <- sig.ggi[ ,"EntrezGene.ID"]
> names(ggi.gid) <- as.character(sig.ggi[ ,"probe"])
> ## mapping through Entrez Gene ids of NKI and GGI signature
> res <- geneid.map(geneid1=nkis.gid, data1=data.nkis,
+   geneid2=ggi.gid, verbose=FALSE)
> str(res)
List of 4
 $ geneid1: Named chr [1:54] "10212" "4605" "332" "4171" ...
  ..- attr(*, "names")= chr [1:54] "NM_005804" "NM_002466" "NM_001168" "NM_004526" ...
 $ data1  : num [1:150, 1:54] -0.078 0.321 -0.068 -0.282 -0.178 -0.157 -0.017 0.263 -0.07 -0.156 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:150] "NKI_123" "NKI_327" "NKI_291" "NKI_370" ...
  .. ..$ : chr [1:54] "NM_005804" "NM_002466" "NM_001168" "NM_004526" ...
 $ geneid2: Named chr [1:54] "10212" "4605" "332" "4171" ...
  ..- attr(*, "names")= chr [1:54] "201584_s_at" "201710_at" "202094_at" "202107_s_at" ...
 $ data2  : NULL