Last data update: 2014.03.03

findDMR    R Documentation

Identifying Differentially Methylated Regions (DMRs)

Description

Identifies differentially methylated regions in pairwise or multiple-sample comparisons.

Usage

## S4 method for signature 'methylPipe,BSdataSet'
findDMR(object, Nproc=NULL, ROI=NULL,
pmdGRanges=NULL, MCClass='mCG', dmrSize=10, dmrBp=1000, binsize=0,
eprop=0.3, coverage=1, Pvalue=NULL, SNPs=NULL)

Arguments

object

An object of class BSdataSet

Nproc

numeric; the number of processors to use; one chromosome is run on each processor

ROI

GRanges; either NULL or an object of class GRanges containing the genomic regions of interest within which DMRs are identified

pmdGRanges

a GRanges object containing the genomic coordinates of Partially Methylated Domains that will be masked

MCClass

character; the mC sequence context to be considered, one of 'all', 'mCG', 'mCHG' or 'mCHH'

dmrSize

numeric; the number of consecutive mC to be considered simultaneously; at least 5

dmrBp

numeric; the maximum number of base pairs that may contain the dmrSize mC

binsize

numeric; the size of the bin used for smoothing the methylation levels, useful for non-CG methylation in human

eprop

numeric; the difference between the maximum and minimum methylation level is determined for each mC (or for each bin), and only mC (or bins) with a difference greater than eprop are considered

coverage

numeric; the minimum number of total reads at a given cytosine genomic position

Pvalue

numeric; if provided, only mC with a significant p-value are selected

SNPs

GRanges; if SNP information is provided, the cytosines at those positions are removed from the DMR computation
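
The interplay between dmrSize and dmrBp can be illustrated with a small base R sketch. It assumes, based only on the argument descriptions above, that a candidate region is a run of dmrSize consecutive mC whose span does not exceed dmrBp. The helper candidate_windows and the toy positions are hypothetical and are not part of methylPipe, whose internal implementation may differ.

```r
# Hypothetical helper, not part of methylPipe: list the windows of
# `dmrSize` consecutive mC positions that span at most `dmrBp` base pairs.
candidate_windows <- function(pos, dmrSize = 10, dmrBp = 1000) {
  pos <- sort(pos)
  n <- length(pos)
  if (n < dmrSize) {
    # Not enough cytosines to form a single window.
    return(data.frame(start = numeric(0), end = numeric(0)))
  }
  first <- seq_len(n - dmrSize + 1)     # index of the first mC in each window
  last  <- first + dmrSize - 1          # index of the last mC in each window
  span  <- pos[last] - pos[first] + 1   # window width in bp
  keep  <- span <= dmrBp                # discard windows that are too wide
  data.frame(start = pos[first][keep], end = pos[last][keep])
}

# Toy positions (assumed): a dense cluster of six mC, then three isolated ones.
pos <- c(100, 120, 150, 180, 200, 230, 5000, 9000, 15000)
windows <- candidate_windows(pos, dmrSize = 5, dmrBp = 200)
print(windows)
```

Under these assumptions only the windows that lie entirely within the dense cluster survive; any window that would have to reach one of the isolated cytosines exceeds dmrBp and is dropped.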

Details

For non-CG methylation in human, a dmrSize of 50, a dmrBp of 50000 and a binsize of 1000 are typically used. For CpG methylation in human, and for both CpG and non-CG methylation in plants, the default settings are usually fine.

Partially Methylated Domains are usually found in differentiated cells and can constitute up to one third of the genome (Lister R et al., Nature 2009). DMRs are usually not selected within those regions, especially when comparing differentiated and pluripotent cells.

eprop speeds up the analysis by excluding mC (or bins) with small methylation differences before testing.

The test used to compare the methylation levels (percentage of methylated reads for each mC) depends on the number of samples. With two samples, the non-parametric Wilcoxon test is used. With more than two samples, the non-parametric Kruskal-Wallis test is used.

See consolidateDMRs to further process and finalize the list of DMRs.
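
The filtering and testing steps described above can be sketched in base R. This is only an illustration on simulated data, assuming the max - min difference is taken across samples for each mC and that the tests are applied to the per-mC methylation levels of a region. The object names (meth, kept, p_two, p_multi) are hypothetical, and findDMR itself may organize these steps differently.

```r
set.seed(1)  # reproducible toy data

# Simulated methylation levels (fraction of methylated reads) for 10 mC
# in three samples; rows are cytosines, columns are samples.
meth <- cbind(
  s1 = runif(10, 0.70, 0.95),
  s2 = runif(10, 0.10, 0.40),
  s3 = runif(10, 0.65, 0.90)
)

# eprop-style pre-filter: keep only the mC whose max - min level across
# samples exceeds the threshold.
eprop <- 0.3
delta <- apply(meth, 1, max) - apply(meth, 1, min)
kept  <- meth[delta > eprop, , drop = FALSE]

# Two samples: Wilcoxon rank-sum test on the per-mC levels.
p_two <- wilcox.test(kept[, "s1"], kept[, "s2"])$p.value

# More than two samples: Kruskal-Wallis test across all columns.
p_multi <- kruskal.test(as.list(as.data.frame(kept)))$p.value

print(c(two_samples = p_two, three_samples = p_multi))
```

Both wilcox.test and kruskal.test come from the stats package shipped with R, so the sketch runs without methylPipe installed.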

Value

A GRanges object of DMRs with the metadata columns pValue, MethDiff_Perc and log2Enrichment. When two samples are compared, MethDiff_Perc is the difference in percentage methylation between the two conditions, whereas log2Enrichment is the log2 ratio between the sample means.
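
A worked example may help to tell the two effect-size columns apart. The numbers below are invented, and both the direction of the comparison (E relative to C) and the use of the mean across the mC of a region are assumptions, so the values are not guaranteed to match what findDMR reports.

```r
# Toy methylation levels (percent methylated reads) for the mC in one region.
control   <- c(20, 25, 30, 15, 10)   # hypothetical sample "C"
treatment <- c(60, 70, 65, 55, 50)   # hypothetical sample "E"

# Difference in mean percentage methylation (assumed direction: E minus C).
meth_diff_perc <- mean(treatment) - mean(control)

# log2 ratio of the sample means (same assumed direction).
log2_enrichment <- log2(mean(treatment) / mean(control))

print(c(MethDiff_Perc = meth_diff_perc, log2Enrichment = log2_enrichment))
```

The difference is an absolute change in percentage points, while the log2 ratio is relative, so a region with low methylation in both samples can show a small difference but a large enrichment.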

Author(s)

Mattia Pelizzola, Kamal Kishore

See Also

consolidateDMRs

Examples

require(BSgenome.Hsapiens.UCSC.hg18)
uncov_GR <- GRanges(Rle('chr20'), IRanges(c(14350,69251,84185), c(18349,73250,88184)))
H1data <- system.file('extdata', 'H1_chr20_CG_10k_tabix_out.txt.gz', package='methylPipe')
H1.db <- BSdata(file=H1data, uncov=uncov_GR, org=Hsapiens)
IMR90data <- system.file('extdata', 'IMR90_chr20_CG_10k_tabix_out.txt.gz', package='methylPipe')
IMR90.db <- BSdata(file=IMR90data, uncov=uncov_GR, org=Hsapiens)
H1.IMR90.set <- BSdataSet(org=Hsapiens, group=c("C","E"), IMR90=IMR90.db, H1=H1.db)
gr_file <- system.file('extdata', 'GR_chr20.Rdata', package='methylPipe')
load(gr_file)
DMRs <- findDMR(object=H1.IMR90.set, Nproc=1, ROI=GR_chr20, MCClass='mCG',
                dmrSize=10, dmrBp=1000, eprop=0.3)
head(DMRs)

Results

> library(methylPipe)
> ### Name: findDMR
> ### Title: Identifying Differentially Methylated Regions (DMRs)
> ### Aliases: findDMR findDMR,methylPipe,BSdataSet
> ###   findDMR,methylPipe,BSdataSet-method findDMR-methods
> ###   findDMR,BSdataSet-method
> 
> ### ** Examples
> 
> require(BSgenome.Hsapiens.UCSC.hg18)
Loading required package: BSgenome.Hsapiens.UCSC.hg18
Loading required package: BSgenome
Loading required package: rtracklayer
> uncov_GR <- GRanges(Rle('chr20'), IRanges(c(14350,69251,84185), c(18349,73250,88184)))
> H1data <- system.file('extdata', 'H1_chr20_CG_10k_tabix_out.txt.gz', package='methylPipe')
> H1.db <- BSdata(file=H1data, uncov=uncov_GR, org=Hsapiens)
> IMR90data <- system.file('extdata', 'IMR90_chr20_CG_10k_tabix_out.txt.gz', package='methylPipe')
> IMR90.db <- BSdata(file=IMR90data, uncov=uncov_GR, org=Hsapiens)
> H1.IMR90.set <- BSdataSet(org=Hsapiens, group=c("C","E"), IMR90=IMR90.db, H1=H1.db)
> gr_file <- system.file('extdata', 'GR_chr20.Rdata', package='methylPipe')
> load(gr_file)
> DMRs <- findDMR(object= H1.IMR90.set, Nproc=1, ROI=GR_chr20, MCClass='mCG',
+ dmrSize=10, dmrBp=1000, eprop=0.3)
> head(DMRs)
GRanges object with 6 ranges and 3 metadata columns:
      seqnames         ranges strand |    pValue MethDiff_Perc log2Enrichment
         <Rle>      <IRanges>  <Rle> | <numeric>     <numeric>      <numeric>
  [1]    chr20 [14404, 15096]      * |     0.006         41.58          1.373
  [2]    chr20 [15059, 15275]      * |     0.013         33.03          0.812
  [3]    chr20 [15766, 16396]      * |     0.032         30.78          0.804
  [4]    chr20 [72059, 73012]      * |     0.006         51.89           1.99
  [5]    chr20 [84876, 85872]      * |     0.006         41.41           0.97
  [6]    chr20 [85760, 85937]      * |     0.011         26.92          0.513
  -------
  seqinfo: 1 sequence from an unspecified genome; no seqlengths
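
The six ranges printed above can be screened further before a tool such as consolidateDMRs is applied. The sketch below re-enters those rows as a plain data frame so that it runs without Bioconductor. The thresholds (p-value below 0.01, methylation difference above 40) and the object names dmrs and strong are arbitrary choices made for illustration. With the actual GRanges object, the same condition could be applied to DMRs$pValue and DMRs$MethDiff_Perc.

```r
# The six DMRs from the output above, copied into a base R data frame.
dmrs <- data.frame(
  seqnames       = "chr20",
  start          = c(14404, 15059, 15766, 72059, 84876, 85760),
  end            = c(15096, 15275, 16396, 73012, 85872, 85937),
  pValue         = c(0.006, 0.013, 0.032, 0.006, 0.006, 0.011),
  MethDiff_Perc  = c(41.58, 33.03, 30.78, 51.89, 41.41, 26.92),
  log2Enrichment = c(1.373, 0.812, 0.804, 1.99, 0.97, 0.513)
)

# Region width in bp (both coordinates inclusive).
dmrs$width <- dmrs$end - dmrs$start + 1

# Keep regions passing both (arbitrary) thresholds, largest difference first.
strong <- dmrs[dmrs$pValue < 0.01 & dmrs$MethDiff_Perc > 40, ]
strong <- strong[order(-strong$MethDiff_Perc), ]
print(strong)
```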