R Graphical Manual

Last data update: 2014.03.03

getNormFactors

R Documentation

Determine Normalisation factors

Description

Determine normalisation factors for a specified set of samples. Optionally, only a subset of the peaks is used to determine the factors. The resulting factors can be accessed with DBA$MD$NormFactors. Normalised total counts are also computed and stored in DBA$MD$NormTotalCounts.

Usage

getNormFactors(DBA, method = "DESeq", SampleIDs = NULL, Usefiltered = TRUE,
PeakIDs = NULL, overWrite = FALSE)

Arguments

`DBA`	DBA object after running getPeakProfiles.
`method`	Normalisation method. Currently only the DESeq method is implemented [1].
`SampleIDs`	IDs of the samples to be normalised; if NULL, all samples are used.
`Usefiltered`	If TRUE, only peaks that have passed the outlier filter are used. findOutliers() must be run first; otherwise all peaks are used.
`PeakIDs`	Subset of peaks to be used to determine the normalisation factors; if NULL, all peaks are used.
`overWrite`	If TRUE, previously computed NormFactors and NormTotalCounts are overwritten.
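
The DESeq method named in the `method` argument estimates one scaling factor per sample with the median-of-ratios approach described in [1]. The following is a minimal base R sketch of that idea on a made-up count matrix. It is not the MMDiff implementation: the helper `sizeFactorsSketch` and the toy data are hypothetical, and dividing counts by the factors follows the DESeq convention rather than anything stated on this page.

```r
# Toy count matrix: rows are peaks, columns are samples (made-up values).
counts <- matrix(
  c(100, 200, 400,
     50, 100, 200,
     10,  20,  40,
      0,   5,   8),
  ncol = 3, byrow = TRUE,
  dimnames = list(paste0("peak", 1:4), c("A", "B", "C"))
)

# Median-of-ratios size factors as described in [1].
# Hypothetical helper for illustration, not part of MMDiff.
sizeFactorsSketch <- function(counts) {
  logCounts <- log(counts)
  # Log geometric mean of each peak across samples. Peaks with a zero
  # count in any sample give -Inf and are excluded below.
  logGeoMeans <- rowMeans(logCounts)
  keep <- is.finite(logGeoMeans)
  # Per sample: median ratio of its counts to the per-peak geometric mean.
  apply(logCounts[keep, , drop = FALSE], 2, function(x) {
    exp(median(x - logGeoMeans[keep]))
  })
}

sf <- sizeFactorsSketch(counts)

# DESeq convention (assumed here): divide each sample's counts by its factor.
normCounts <- sweep(counts, 2, sf, "/")
```

In this toy matrix the three samples differ only by a constant depth ratio, so after scaling the first three peaks have identical counts in every sample. The fourth peak contains a zero and does not contribute to the factors.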

Value

A DBA object with the additional list elements NormFactors and NormTotalCounts appended to MD. Note that if you call getNormFactors several times with different parameters, more than one set of normalisation factors can be appended. However, NormTotalCounts will be overwritten unless specified otherwise.
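
Because several factor sets can coexist, it may help to see how one set could be retrieved. The sketch below builds a mock of the MD element from the values printed in the example run on this page. The assumption that each set is keyed by the comma-joined sample IDs is inferred from that printed output and is not documented behaviour.

```r
# Mock of the MD element, reproducing the factors printed in the example run.
# A real DBA object carries many more elements.
MD <- list(NormFactors = list(DESeq = list(
  "WT.AB2,Null.AB2,Resc.AB2" = c(WT_2 = 0.9439729,
                                 Null_2 = 0.8899213,
                                 Resc_2 = 1.1986945)
)))

# Assumption: the key of a factor set is the comma-joined sample IDs.
sampleIDs <- c("WT.AB2", "Null.AB2", "Resc.AB2")
key <- paste(sampleIDs, collapse = ",")

# Named numeric vector with one factor per sample.
factors <- MD$NormFactors$DESeq[[key]]
```

With a real object, calling `names()` on the stored list for the chosen method would show which sample combinations have been computed so far.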

Author(s)

Gabriele Schweikert

References

[1] Anders S. and Huber W. (2010). Differential expression analysis for sequence count data. Genome Biology, 11(10): R106.

Examples


# load DBA objects with peak profiles 

data(Cfp1Profiles)
Cfp1Norm <- getNormFactors(Cfp1Profiles)
Cfp1Norm$MD$NormFactors

# compare total counts before and after normalisation:
boxplot(Cfp1Norm$MD$RawTotalCounts[,1:3], ylim=c(0,2000))
boxplot(Cfp1Norm$MD$NormTotalCounts[,1:3], ylim=c(0,2000))

# compare individual peak profiles before and after normalisation,
# using plotPeak, e.g.:

plotPeak(Cfp1Norm, Peak.id=20, NormMethod = NULL)

plotPeak(Cfp1Norm, Peak.id=20, NormMethod = 'DESeq')




# You can also specify a subset of samples which should be normalised, e.g.:

SampleIDs <- c("WT.AB2", "Null.AB2")
Cfp1Norm2 <- getNormFactors(Cfp1Profiles, SampleIDs=SampleIDs)

# Or you can specify a subset of peaks which should be used to determine
# the normalisation factors. For example, run findOutliers first:

Cfp1 <- findOutliers(Cfp1Profiles, range=5)
PeakIDs <- Cfp1$MD$Filter$FiltPeakIds
Cfp1Norm3 <- getNormFactors(Cfp1, PeakIDs = PeakIDs)

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(MMDiff)
Loading required package: GenomicRanges
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: S4Vectors
Loading required package: stats4

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: DiffBind
Loading required package: SummarizedExperiment
Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.


Loading required package: GMD
Loading required package: Rsamtools
Loading required package: Biostrings
Loading required package: XVector
Warning message:
Package 'MMDiff' is deprecated and will be removed from Bioconductor
  version 3.4 
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/MMDiff/getNormFactors.Rd_%03d_medium.png", width=480, height=480)
> ### Name: getNormFactors
> ### Title: Determine Normalisation factors
> ### Aliases: getNormFactors
> 
> ### ** Examples
> 
> 
> # load DBA objects with peak profiles 
> 
> data(Cfp1Profiles)
> Cfp1Norm <- getNormFactors(Cfp1Profiles)
Computing Scaling factor according to DESeq normalization method
Using all Samples: nSamples = 3
Samples:
[1] "WT.AB2"   "Null.AB2" "Resc.AB2"
Using unfiltered Peaks
nPeaks = 1000 (of 1000)
appending NormTotalCounts
Determined Factors:
$`WT.AB2,Null.AB2,Resc.AB2`
     WT_2    Null_2    Resc_2 
0.9439729 0.8899213 1.1986945 

> Cfp1Norm$MD$NormFactors
$DESeq
$DESeq$`WT.AB2,Null.AB2,Resc.AB2`
     WT_2    Null_2    Resc_2 
0.9439729 0.8899213 1.1986945 


> 
> # compare total counts before and after normalisation:
> boxplot(Cfp1Norm$MD$RawTotalCounts[,1:3], ylim=c(0,2000))
> boxplot(Cfp1Norm$MD$NormTotalCounts[,1:3], ylim=c(0,2000))
> 
> # compare individual peak profiles before and after normalisation,
> # using plotPeak, e.g.:
> 
> plotPeak(Cfp1Norm, Peak.id=20, NormMethod = NULL)
No normalization factors applied
> 
> plotPeak(Cfp1Norm, Peak.id=20, NormMethod = 'DESeq')
> 
> 
> 
> 
> # You can also specify a subset of samples which should be normalised, e.g:
> 
> SampleIDs <- c("WT.AB2", "Null.AB2")
> Cfp1Norm2 <- getNormFactors(Cfp1Profiles, SampleIDs=SampleIDs)
Computing Scaling factor according to DESeq normalization method
Using subset of samples: nSamples = 2 (of 3)
Samples:
[1] "WT.AB2"   "Null.AB2"
Using unfiltered Peaks
nPeaks = 1000 (of 1000)
appending NormTotalCounts
Determined Factors:
$`WT.AB2,Null.AB2`
     WT_2    Null_2 
1.0249705 0.9756378 

> 
> # Or you can specify a subset of peaks which should be used to determine
> # the normalisation factors. For example run findOutliers:
> 
> Cfp1 <- findOutliers(Cfp1Profiles, range=5)
Error in dev.new() : no suitable unused file name for pdf()
Calls: findOutliers -> dev.new
Execution halted