Last data update: 2014.03.03

MEDIPS.createSet    R Documentation

Creates a MEDIPS SET by reading a suitable input file

Description

Reads the input file and calculates genome-wide short read coverage (counts) at the specified window size. After the input file has been read, the MEDIPS SET contains the input file name, the reference organism, the chromosomes included in the input file, the lengths of those chromosomes (loaded automatically), and the number of regions.
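
The idea of window-based coverage can be illustrated with a small base R sketch. It is not the MEDIPS implementation: the chromosome length, the read coordinates, and the overlap rule (a read counts in every window it touches) are assumptions chosen for illustration.

```r
# Toy example: count reads overlapping fixed-size genomic windows.
# All values here are hypothetical and chosen for illustration only.
chr_length  <- 1000L   # hypothetical chromosome length in bases
window_size <- 300L    # same value as the window_size default in Usage

# Window boundaries: 1-300, 301-600, 601-900, 901-1000
win_start <- seq(1L, chr_length, by = window_size)
win_end   <- pmin(win_start + window_size - 1L, chr_length)

# Hypothetical reads given as closed intervals [start, end]
reads <- data.frame(start = c(10L, 250L, 280L, 620L, 950L),
                    end   = c(59L, 299L, 329L, 669L, 999L))

# Assumption: a read is counted in every window it overlaps
counts <- vapply(seq_along(win_start), function(i) {
  sum(reads$start <= win_end[i] & reads$end >= win_start[i])
}, integer(1))

data.frame(win_start, win_end, counts)
```

The read spanning positions 280 to 329 crosses a window boundary, so under this rule it contributes to both the first and the second window.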

Usage

MEDIPS.createSet(file=NULL, extend=0, shift=0, window_size=300, BSgenome=NULL, uniq=1e-3, chr.select=NULL, paired = F, sample_name=NULL, bwa=FALSE)

Arguments

file

Path and file name of the input data

BSgenome

The reference genome name as defined by BSgenome

extend

Defines the number of bases by which each region will be extended before the genome vector is calculated. Regions are extended along the plus or the minus strand according to their strand information.

shift

As an alternative to the extend parameter, the shift parameter can be specified. Here, the reads are not extended but shifted by the specified number of nucleotides with respect to the given strand information. One of the two parameters, extend or shift, has to be 0.

uniq

The uniq parameter determines whether all reads mapping to exactly the same genomic position are kept (uniq = 0), replaced by a single representative (uniq = 1), or capped at a maximal number of stacked reads per genomic position (0 < uniq < 1). In the last case, uniq is interpreted as a p-value, and the cap is derived from a Poisson distribution of stacked reads genome-wide (default: 1e-3). The smaller the p-value, the more reads at the same genomic position are potentially allowed.

chr.select

Only data at the specified chromosomes will be processed.

window_size

Defines the genomic resolution (window width in bases) at which short read coverage is calculated.

paired

Option for paired-end reads.

sample_name

Name of the sample to be stored with the MEDIPS SET.

bwa

Indicates whether the alignment file was generated by bwa (default: FALSE). Enabling bwa allows the first mate of a pair to be either the 'left' or the 'right' mate.
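
The difference between extend and shift can be sketched on toy reads in base R. The helper adjust_reads below is hypothetical and not part of MEDIPS; it follows the wording of the argument descriptions above (extend adds bases in the direction of the strand, shift moves the whole read), and the package itself may handle details such as read length or chromosome ends differently.

```r
# Toy reads; coordinates and strands are hypothetical.
reads <- data.frame(start  = c(100L, 500L),
                    end    = c(149L, 549L),
                    strand = c("+", "-"),
                    stringsAsFactors = FALSE)

# Hypothetical helper (not part of MEDIPS): applies either extend or shift,
# following the wording of the argument descriptions above.
adjust_reads <- function(reads, extend = 0L, shift = 0L) {
  # The documentation requires that one of the two parameters is 0
  stopifnot(extend == 0 || shift == 0)
  plus <- reads$strand == "+"
  # extend: lengthen the read in its strand direction
  reads$end[plus]    <- reads$end[plus] + extend
  reads$start[!plus] <- reads$start[!plus] - extend
  # shift: move the whole read in its strand direction
  reads$start[plus]  <- reads$start[plus] + shift
  reads$end[plus]    <- reads$end[plus] + shift
  reads$start[!plus] <- reads$start[!plus] - shift
  reads$end[!plus]   <- reads$end[!plus] - shift
  # Assumption: clamp start coordinates to the first base of the chromosome
  reads$start <- pmax(reads$start, 1L)
  reads
}

extended <- adjust_reads(reads, extend = 250L)  # reads get longer
shifted  <- adjust_reads(reads, shift = 100L)   # reads keep their length
```

In this sketch the plus-strand read grows or moves towards higher coordinates and the minus-strand read towards lower coordinates. Passing non-zero values for both parameters stops with an error, mirroring the rule that one of the two has to be 0.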

Value

An object of class MEDIPSset.

Author(s)

Lukas Chavez, Mathias Lienhard, Isaac Lopez Moyado

Examples


library("BSgenome.Hsapiens.UCSC.hg19")
bam.file.hESCs.Rep1.MeDIP = system.file("extdata", "hESCs.MeDIP.Rep1.chr22.bam", package="MEDIPSData")

MSet=MEDIPS.createSet(file=bam.file.hESCs.Rep1.MeDIP, BSgenome="BSgenome.Hsapiens.UCSC.hg19", chr.select="chr22", extend=250, shift=0, uniq=1e-3)
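
The example passes uniq=1e-3, so the cap on stacked reads comes from a Poisson model. The following base R sketch shows one plausible way such a cap could be derived; the helper max_stacked_reads is hypothetical and not part of MEDIPS, and both the choice of the mean (reads divided by genome size) and the chr22 length used here are assumptions.

```r
# Hypothetical helper (not part of MEDIPS): maximal number of stacked reads
# allowed per position for a given p-value, under a Poisson model.
max_stacked_reads <- function(n_reads, genome_size, uniq = 1e-3) {
  stopifnot(uniq > 0, uniq < 1)
  lambda <- n_reads / genome_size   # assumed mean number of reads per position
  max(1, qpois(1 - uniq, lambda))   # always keep at least one read
}

n_reads     <- 152586     # imported reads reported in the Results log
genome_size <- 51304566   # assumption: length of chr22 in hg19

cap_default <- max_stacked_reads(n_reads, genome_size, uniq = 1e-3)
cap_strict  <- max_stacked_reads(n_reads, genome_size, uniq = 1e-6)
```

Under these assumptions cap_default evaluates to 1, which is consistent with the message "Keep at most 1 read(s) mapping to the same genomic location" in the session output, and cap_strict evaluates to 2, illustrating that a smaller p-value allows more stacked reads.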

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(MEDIPS)
Loading required package: BSgenome
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: S4Vectors
Loading required package: stats4

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: GenomicRanges
Loading required package: Biostrings
Loading required package: XVector
Loading required package: rtracklayer
Loading required package: Rsamtools
> ### Name: MEDIPS.createSet
> ### Title: Creates a MEDIPS SET by reading a suitable input file
> ### Aliases: MEDIPS.createSet getMObjectFromWIG readRegionsFile
> ###   getPairedGRange scanBamToGRanges
> 
> ### ** Examples
> 
> 
> library("BSgenome.Hsapiens.UCSC.hg19")
> bam.file.hESCs.Rep1.MeDIP = system.file("extdata", "hESCs.MeDIP.Rep1.chr22.bam", package="MEDIPSData")
> 
> MSet=MEDIPS.createSet(file=bam.file.hESCs.Rep1.MeDIP, BSgenome="BSgenome.Hsapiens.UCSC.hg19", chr.select="chr22", extend=250, shift=0, uniq=1e-3)
Reading bam alignment hESCs.MeDIP.Rep1.chr22.bam 
Selecting  chr22 
Total number of imported short reads: 152586
Extending reads...
Creating GRange Object...
Keep at most 1 read(s) mapping to the same genomic location
Number of remaining reads: 150793
Calculating genomic coordinates...
Creating Granges object for genome wide windows...
Calculating short read coverage at genome wide windows...