Last data update: 2014.03.03

R: MERGE ILLUMINA PROBE CLASSES
merge_classes    R Documentation

MERGE ILLUMINA PROBE CLASSES

Description

Illumina assigns fairly specific functional classes to its probes. Eleven classes are defined (Bibikova et al., 2009; Bibikova et al., 2011):

  • Relation to gene: Body, 5'UTR, 3'UTR, 1stExon, TSS1500, TSS200

  • Relation to CpG island: Island, N_Shelf, N_Shore, S_Shelf, S_Shore

Because DMRforPairs looks at probes in close proximity to each other within each class, very specific annotations might leave too few probes per region per class, which in turn reduces the number of identified regions and/or the statistical power. This function therefore allows classes of interest to be grouped and/or selected.

Usage

merge_classes(refgene_class, island_class, recode = 1, sep = ";")

Arguments

refgene_class

See the Description and the classes_gene parameter of DMRforPairs.

island_class

See the Description and the classes_island parameter of DMRforPairs.

recode

Recoding scheme to use for the functional classes. Can be a custom scheme (data frame) or one of the built-in schemes (0, 1 or 2). See Details.

sep

Separator used in the second column of a custom recode data frame. Use ";" (the default) or leave it unspecified when using the built-in schemes.

Details

The recode parameter can be set to use one of the built-in recoding schemes:

  • 0: analyze all 11 classes annotated by Illumina separately

  • 1: group Body, 5'UTR and 3'UTR into one category ("gene") and TSS1500 and TSS200 into another ("tss"). All island-associated classes are merged into one class ("island").

  • 2: analyze all probes together without subdivision into classes (this also includes probes associated with no class).
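
To make the grouping in scheme 1 concrete, the sketch below writes it out in the same two-column layout used for custom schemes. This is only an illustration built from the wording above: the column names `merged` and `original` are assumptions, and the package may represent its built-in schemes differently. As worded above, 1stExon is not listed under scheme 1; whether the built-in scheme covers it is not stated here.

```r
# Scheme 1 as described in the text, written as a two-column data frame.
# Column names are illustrative assumptions, not taken from the package.
scheme1 <- data.frame(
  merged   = c("gene", "tss", "island"),
  original = c("Body;5'UTR;3'UTR",
               "TSS1500;TSS200",
               "Island;N_Shelf;N_Shore;S_Shelf;S_Shore"),
  stringsAsFactors = FALSE
)

# The 11 Illumina classes listed in the Description.
all11 <- c("Body", "5'UTR", "3'UTR", "1stExon", "TSS1500", "TSS200",
           "Island", "N_Shelf", "N_Shore", "S_Shelf", "S_Shore")

# Classes mentioned by the scheme, and those it leaves out.
covered     <- unlist(strsplit(scheme1$original, ";", fixed = TRUE))
not_covered <- setdiff(all11, covered)
print(not_covered)
```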

The recode parameter can also be set to a custom recoding scheme (data.frame). For example:

data.frame(c("gene", "tss"), c("Body;5'UTR;3'UTR;1stExon", "TSS1500;TSS200"))

In this scheme the classes are merged into two categories: TSS or other gene region. Probes associated solely with CpG island-related classes are discarded. Probes not annotated to any of the 11 classes are always discarded by DMRforPairs, except when option 2 is used, which collects all probes into one class (i.e. ignores classes).
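
The effect of such a custom scheme can be sketched in base R. The code below is not the implementation used by DMRforPairs. The helper `expand_scheme`, the column names, and the toy `probe_classes` vector (which assumes per-probe annotations are ";"-separated strings) are all hypothetical and serve only to show how the second column is split on the separator and how island-only or unannotated probes end up without a category.

```r
# Custom recoding scheme from the text: merged category -> original classes.
recode <- data.frame(
  merged   = c("gene", "tss"),
  original = c("Body;5'UTR;3'UTR;1stExon", "TSS1500;TSS200"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of DMRforPairs): expand the scheme into a
# named lookup vector mapping each original class to its merged category.
expand_scheme <- function(recode, sep = ";") {
  parts <- strsplit(recode[[2]], sep, fixed = TRUE)
  setNames(rep(recode[[1]], times = lengths(parts)), unlist(parts))
}

lookup <- expand_scheme(recode)

# Toy per-probe annotations (assumed format: classes separated by ";").
probe_classes <- c("Body;TSS1500", "3'UTR", "Island", "")

# Map each probe to the unique merged categories it belongs to.
merged <- vapply(strsplit(probe_classes, ";", fixed = TRUE), function(cl) {
  hit <- unique(unname(lookup[cl[cl %in% names(lookup)]]))
  paste(hit, collapse = ";")
}, character(1))

# Probes with no class in the scheme (the island-only probe and the
# unannotated probe) receive an empty string.
no_class <- which(merged == "")
```

In this toy input the first probe falls into both "gene" and "tss", while the third and fourth probes match nothing in the scheme, mirroring the statement that island-only and unannotated probes are discarded.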

If the classes are unknown, m x 1 character vectors containing "unknown.gene" and "unknown.island" for all m rows (probes) can be passed as the refgene_class and island_class parameters, respectively. In that case recode must be set to 2.
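
A minimal sketch of building these placeholder vectors follows. The probe count `m` is an arbitrary illustrative value, and the call to merge_classes is shown only as a comment, using the signature from the Usage section, since it needs the DMRforPairs package to be installed.

```r
# Number of probes in the data set (illustrative value only).
m <- 5L

# Placeholder annotations for data without known classes.
refgene_class <- rep("unknown.gene", m)
island_class  <- rep("unknown.island", m)

# With DMRforPairs loaded, these would be passed with recode = 2, e.g.:
# merge_classes(refgene_class, island_class, recode = 2)
```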

Value

List of objects containing:

$pclass

original classes per probe, with gene and island classes combined (m x 1 data frame)

$pclass_merged

classes after recoding (m x 1 data frame)

$no.pclass

row indexes of probes with no annotation to any of the classes specified in the recoding scheme.

$u_pclass

unique list of the classes of interest after recoding (i.e. the first column of the recode data frame)

Author(s)

Martin Rijlaarsdam, m.a.rijlaarsdam@gmail.com

References

  • Bibikova, M., et al., High density DNA methylation array with single CpG site resolution. Genomics, 2011. 98(4): p. 288-95.

  • Bibikova, M., et al., Genome-wide DNA methylation profiling using Infinium(R) assay. Epigenomics, 2009. 1(1): p. 177-200.

Examples

#merge_classes() is an integral part of the DMRforPairs() wrapper and is 
#not usually called by the user directly. Please see DMRforPairs() for 
#an example.
