Last data update: 2014.03.03

R: Get mapped Entrez Gene IDs from CpG probe names
getMappedEntrezIDs    R Documentation

Get mapped Entrez Gene IDs from CpG probe names

Description

Given a set of CpG probe names and optionally all the CpG sites tested, this function outputs a list containing the mapped Entrez Gene IDs as well as the numbers of probes per gene, and a vector indicating significance.

Usage

getMappedEntrezIDs(sig.cpg, all.cpg = NULL)

Arguments

sig.cpg

character vector of significant CpG sites used for testing gene set enrichment

all.cpg

character vector of all CpG sites tested. Defaults to all CpG sites on the array.

Details

This function is used by the gene set testing functions gometh and gsameth. It maps the significant CpG probe names, as well as all the CpG sites tested, to Entrez Gene IDs. It also calculates the number of probes per gene.

Genes associated with each CpG site are obtained from the annotation package IlluminaHumanMethylation450kanno.ilmn12.hg19.
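The mapping described above can be illustrated with a small base-R sketch. The probe-to-gene table `toy_map`, its probe names, and its Entrez IDs are made up for illustration and do not come from the annotation package. The code only mirrors the outputs described on this page; it is not the package's internal implementation, and it ignores complications such as probes annotated to several genes.

```r
# Hypothetical probe-to-gene table (made-up probe names and Entrez IDs).
toy_map <- data.frame(
  cpg    = c("cg01", "cg02", "cg03", "cg04", "cg05", "cg06"),
  entrez = c("10",   "10",   "100",  "100",  "100",  "1000"),
  stringsAsFactors = FALSE
)

all_cpg <- toy_map$cpg          # every probe tested
sig_cpg <- c("cg02", "cg06")    # probes called significant

# Entrez Gene IDs reached by all tested probes and by the significant ones.
eg_all   <- toy_map$entrez[toy_map$cpg %in% all_cpg]
universe <- unique(eg_all)
sig_eg   <- unique(toy_map$entrez[toy_map$cpg %in% sig_cpg])

# Number of probes per gene, and a 0/1 indicator over the universe.
freq <- table(eg_all)
de   <- as.integer(universe %in% sig_eg)
```

In this toy case gene "100" has three probes but none is significant, so it contributes to `freq` and `universe` while its entry in `de` is zero.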

Value

A list with the following elements:

sig.eg

mapped Entrez Gene IDs for the significant probes

universe

mapped Entrez Gene IDs for all probes on the array, or for all the CpG probes tested.

freq

table output with numbers of probes associated with each gene

de

a vector of ones and zeroes of the same length as universe indicating which genes in the universe are significantly differentially methylated.
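The relationships between these elements can be checked with a short helper. This is a sketch only: `check_mapped` is a hypothetical function, not part of the package, and the checks are expectations inferred from the descriptions above (for instance, that the significant genes fall inside the universe when the significant probes are a subset of the tested probes), not documented guarantees. The `toy` list uses made-up IDs.

```r
# Hypothetical helper: sanity checks on a list shaped like the return value.
check_mapped <- function(m) {
  stopifnot(is.list(m),
            all(c("sig.eg", "universe", "freq", "de") %in% names(m)))
  c(
    de_matches_universe = length(m$de) == length(m$universe),
    de_is_binary        = all(m$de %in% c(0, 1)),
    sig_in_universe     = all(m$sig.eg %in% m$universe)
  )
}

# Toy list with made-up Entrez IDs, standing in for real output.
toy <- list(
  sig.eg   = c("10", "1000"),
  universe = c("10", "100", "1000"),
  freq     = table(c("10", "10", "100", "100", "100", "1000")),
  de       = c(1L, 0L, 1L)
)

check_mapped(toy)
```

Each entry of the returned logical vector is named after the check it represents, so a failing expectation is easy to spot.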

Author(s)

Belinda Phipson

See Also

gometh, gsameth

Examples

library(missMethyl)
library(IlluminaHumanMethylation450kanno.ilmn12.hg19)
library(org.Hs.eg.db)
library(limma)
ann <- getAnnotation(IlluminaHumanMethylation450kanno.ilmn12.hg19)

# Randomly select 1000 CpGs to be significantly differentially methylated
sigcpgs <- sample(rownames(ann),1000,replace=FALSE)

# All CpG sites tested
allcpgs <- rownames(ann)

mappedEz <- getMappedEntrezIDs(sigcpgs,allcpgs)
mappedEz$sig.eg[1:10]
mappedEz$universe[1:10]
mappedEz$freq[1:10]
mappedEz$de[1:10]
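Because the example draws the significant CpGs at random, its output differs from run to run. One common way to make such a draw repeatable is to fix the random seed first. The sketch below uses base R only; the probe names and the seed value are arbitrary choices made for illustration.

```r
# Hypothetical probe names standing in for rownames(ann).
probes <- sprintf("cg%08d", 1:5000)

set.seed(42)   # arbitrary seed, chosen for illustration
first <- sample(probes, 1000, replace = FALSE)

set.seed(42)   # resetting the seed reproduces the same draw
second <- sample(probes, 1000, replace = FALSE)

identical(first, second)   # TRUE
```

Calling set.seed once before the sample call in the example above would have the same effect on the selection of sigcpgs.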

Results


> library(missMethyl)
Setting options('download.file.method.GEOquery'='auto')
Setting options('GEOquery.inmemory.gpl'=FALSE)

> ### Name: getMappedEntrezIDs
> ### Title: Get mapped Entrez Gene IDs from CpG probe names
> ### Aliases: getMappedEntrezIDs
> 
> ### ** Examples
> 
> library(IlluminaHumanMethylation450kanno.ilmn12.hg19)
Loading required package: minfi
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: lattice
Loading required package: GenomicRanges
Loading required package: S4Vectors
Loading required package: stats4

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: SummarizedExperiment
Loading required package: Biostrings
Loading required package: XVector
Loading required package: bumphunter
Loading required package: foreach
Loading required package: iterators
Loading required package: locfit
locfit 1.5-9.1 	 2013-03-22
> library(org.Hs.eg.db)
Loading required package: AnnotationDbi
> library(limma)

Attaching package: 'limma'

The following object is masked from 'package:BiocGenerics':

    plotMA

> ann <- getAnnotation(IlluminaHumanMethylation450kanno.ilmn12.hg19)
> 
> # Randomly select 1000 CpGs to be significantly differentially methylated
> sigcpgs <- sample(rownames(ann),1000,replace=FALSE)
> 
> # All CpG sites tested
> allcpgs <- rownames(ann)
> 
> mappedEz <- getMappedEntrezIDs(sigcpgs,allcpgs)
Warning message:
In alias2SymbolTable(flat$symbol) :
  Multiple symbols ignored for one or more aliases
> mappedEz$sig.eg[1:10]
 [1] "10005"     "1001"      "100101467" "100124536" "100126329" "100126339"
 [7] "100126352" "100129550" "100131205" "100302192"
> mappedEz$universe[1:10]
 [1] "1"         "10"        "100"       "1000"      "10000"     "100008586"
 [7] "10001"     "10002"     "10003"     "100033413"
> mappedEz$freq[1:10]
eg.all
        1        10       100      1000     10000 100008586     10001     10002 
       16         8         9        22        32         3        10        16 
    10003 100033413 
       14         7 
> mappedEz$de[1:10]
 [1] 0 0 0 0 0 0 0 0 0 0