Last data update: 2014.03.03

R: Peak scoring function
peakScoringR Documentation

Peak scoring function

Description

Scores peaks detected with function peakDetection according the height and the sharpness (width) of the peak. This function can be called automatically from peakDetection if score=TRUE.

Usage

	## S4 method for signature 'numeric'
peakScoring(peaks, data, threshold="25%")	
	## S4 method for signature 'list'
peakScoring(peaks, data, threshold="25%", mc.cores=1)
	## S4 method for signature 'IRanges'
peakScoring(peaks, data, threshold="25%", weight.width=1, weight.height=1, dyad.length=38)
	## S4 method for signature 'IRangesList'
peakScoring(peaks, data, threshold="25%", weight.width=1, weight.height=1, dyad.length=38, mc.cores=1)

Arguments

peaks

The identified peaks resulting from peakDetection. Could be a numeric vector with the position of the peaks, or a IRanges object with the extended range of the peak. For both types, list support is implemented as a numeric list or a IRangesList

data

Data of nucleosome coverage or intensites.

threshold

The non-default threshold previously used in peakDetection function, if applicable. Can be given as a percentage string (i.e., "25%" will use the value in the 1st quantile of data) or as an absolute coverage numeric value (i.e., 20 will not look for peaks in regions without less than 20 reads (or reads per milion)).

dyad.length

How many bases account in the nucleosome dyad for sharpness description. If working with NGS data, works best with the reads width value for single-ended data or the trim value given to the processReads function.

weight.height, weight.width

If the score is a range, the height and the widht score (coverage and fuzzynes) can be defined with different weigths with these parameters. See details.

mc.cores

If input is a list or IRangeList, and multiple cores support is available, the maximum number of cores for parallel processing

Details

This function scores each previously identified peak according its height and sharpness.

The height score (score_h) tells how large is a peak, higher means more coverage or intensity, so better positioned nucleosome. This score is obtained by checking the observed peak value in a Normal distribution with the mean and sd of data. This value is between 0 and 1.

The width score (score_w) is a mesure of how sharp is a peak. With a NGS coverage in mind, a perfect phased (well-positioned) nucleosome is this that starts and ends exactly in the same place many times. The shape of this ideal peak will be a rectangular shape of the lenght of the read. A wider top of a peak could indicate fuzzyness. The parameter dyad.length tells how long should be the "flat" region of an ideal peak. The optimum value for this parameter is the lenght of the read in single-ended data or the trim value of the function processReads. For Tiling Array, the default value should be fine. This score is obtained calculating the ratio between the mean of the nucleosome scope (the one provided by range in the elements of peaks) and the dyad.length central bases. This value is normalized between 0 and 1.

For punctual, single points peaks (provided by numeric vector or list as peaks attribute) the score returned is the height score.

For range peaks the weighted sum of the heigth and width scores is used. This is: ((score_h * weigth.height) / sum.wei) + ((score_w * weigth.widht) / sum.wei). Note that you can query for only one score by weting its weight to 1 and the other to 0.

Value

In the case of numeric input, the value returned is a data.frame containing a 'peak' and a 'score' column. If the input is a list, the result will be a list of data.frame.

If input is a IRanges or IRangesList, the result will be a RangedData object with one or multiple spaces respectively and a 3 data column with the mixed, width and heigh score.

Author(s)

Oscar Flores oflores@mmb.cpb.ub.es

See Also

peakDetection, processReads,

Examples


	#Generate a synthetic map
	map = syntheticNucMap(nuc.len=40, lin.len=130) #Trimmed length nucleosome map
	
	#Get the information of dyads and the coverage
	peaks = c(map$wp.starts, map$fz.starts)
	cover = filterFFT(coverage(map$syn.reads))

	#Calculate the scores
	scores = peakScoring(peaks, cover)
	plotPeaks(scores$peak, cover, scores=scores$score, start=5000, end=10000)

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(nucleR)
Loading required package: ShortRead
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: BiocParallel
Loading required package: Biostrings
Loading required package: S4Vectors
Loading required package: stats4

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: IRanges
Loading required package: XVector
Loading required package: Rsamtools
Loading required package: GenomeInfoDb
Loading required package: GenomicRanges
Loading required package: GenomicAlignments
Loading required package: SummarizedExperiment
Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/nucleR/peakScoring.Rd_%03d_medium.png", width=480, height=480)
> ### Name: peakScoring
> ### Title: Peak scoring function
> ### Aliases: peakScoring peakScoring,list-method
> ###   peakScoring,IRangesList-method peakScoring,numeric-method
> ###   peakScoring,IRanges-method
> ### Keywords: manip
> 
> ### ** Examples
> 
> 
> 	#Generate a synthetic map
> 	map = syntheticNucMap(nuc.len=40, lin.len=130) #Trimmed length nucleosome map
> 	
> 	#Get the information of dyads and the coverage
> 	peaks = c(map$wp.starts, map$fz.starts)
> 	cover = filterFFT(coverage(map$syn.reads))
> 
> 	#Calculate the scores
> 	scores = peakScoring(peaks, cover)
> 	plotPeaks(scores$peak, cover, scores=scores$score, start=5000, end=10000)
> 
> 
> 
> 
> 
> dev.off()
null device 
          1 
>