Estimate size factors, which are scaling factors used as "offsets"
by the statistical model to make the different samples comparable.
This is necessary because the different samples may have been
sequenced to slightly different depths. Additionally, the presence
of differentially expressed genes may cause the apparent depth of
many genes to appear different.
This function uses the "geometric" size factor normalization method,
which is identical to the one used by DESeq, DESeq2, DEXSeq, and
the default method used by CuffDiff.
This function is called internally by the runJunctionSeqAnalyses
function, and thus for most purposes users should not need to call this
function directly. It may be useful to advanced users performing non-standard
analyses.
A JunctionSeqCountSet. Usually initially created by
readJunctionSeqCounts. Size factors must be
set, usually using functions estimateSizeFactors and
estimateJunctionSeqDispersions.
method.sizeFactors
Character string. Can be used to apply alternative methodologies or implementations. Intended for advanced users who have strong opinions about the underlying statistical methodologies.
Determines the method used to calculate normalization size factors. By default JunctionSeq uses gene-level expression. As an alternative, feature-level counts can be used as
they are in DEXSeq. In practice the difference is almost always negligible.
replicateDEXSeqBehavior.useRawBaseMean
USED ONLY FOR INTERNAL TESTING! NOT INTENDED FOR ACTUAL USE!
This variable activates an alternative mode in which a (very minor) bug in DEXSeq v1.14.0 and earlier is replicated. If TRUE, the
baseMean and baseVar variables will be computed using raw counts rather than normalized counts.
This is used for internal tests in which DEXSeq functionality is replicated precisely and the results are compared against equivalent DEXSeq results.
Without this option the results would differ slightly (generally by less than 1 hundreth of a percent).
USED ONLY FOR INTERNAL TESTING! NOT INTENDED FOR ACTUAL USE!
calcAltSF
Logical. Determines whether both types of size factor calculations should be generated, and placed in the jscs@altSizeFactors slot.
verbose
if TRUE, send debugging and progress messages to the console / stdout.
file
A file path to write the size factor table.
...
If using the (depreciated) estimateSizeFactors command, use the same syntax as above.
Value
A JunctionSeqCountSet, with size factors included.
Examples
data(exampleDataSet,package="JctSeqData");
jscs <- estimateJunctionSeqSizeFactors(jscs);
## Not run:
########################################
#Set up example data:
decoder.file <- system.file(
"extdata/annoFiles/decoder.bySample.txt",
package="JctSeqData");
decoder <- read.table(decoder.file,
header=TRUE,
stringsAsFactors=FALSE);
gff.file <- system.file(
"extdata/cts/withNovel.forJunctionSeq.gff.gz",
package="JctSeqData");
countFiles <- system.file(paste0("extdata/cts/",
decoder$sample.ID,
"/QC.spliceJunctionAndExonCounts.withNovel.forJunctionSeq.txt.gz"),
package="JctSeqData");
########################################
#Advanced Analysis:
#Make a "design" dataframe:
design <- data.frame(condition = factor(decoder$group.ID));
#Read the QoRTs counts.
jscs = readJunctionSeqCounts(countfiles = countFiles,
samplenames = decoder$sample.ID,
design = design,
flat.gff.file = gff.file
);
#Generate the size factors and load them into the JunctionSeqCountSet:
jscs <- estimateJunctionSeqSizeFactors(jscs);
#Estimate feature-specific dispersions:
jscs <- estimateJunctionSeqDispersions(jscs);
#Fit dispersion function and estimate MAP dispersion:
jscs <- fitJunctionSeqDispersionFunction(jscs);
#Test for differential usage:
jscs <- testForDiffUsage(jscs);
#Estimate effect sizes and expression estimates:
jscs <- estimateEffectSizes( jscs);
## End(Not run)
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(JunctionSeq)
Loading required package: SummarizedExperiment
Loading required package: GenomicRanges
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from 'package:stats':
IQR, mad, xtabs
The following objects are masked from 'package:base':
Filter, Find, Map, Position, Reduce, anyDuplicated, append,
as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
rbind, rownames, sapply, setdiff, sort, table, tapply, union,
unique, unsplit
Loading required package: S4Vectors
Loading required package: stats4
Attaching package: 'S4Vectors'
The following objects are masked from 'package:base':
colMeans, colSums, expand.grid, rowMeans, rowSums
Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: Biobase
Welcome to Bioconductor
Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/JunctionSeq/estimateJunctionSeqSizeFactors.Rd_%03d_medium.png", width=480, height=480)
> ### Name: estimateJunctionSeqSizeFactors
> ### Title: Estimate Size Factors
> ### Aliases: estimateJunctionSeqSizeFactors writeSizeFactors
>
> ### ** Examples
>
>
> data(exampleDataSet,package="JctSeqData");
> jscs <- estimateJunctionSeqSizeFactors(jscs);
>
> ## Not run:
> ##D ########################################
> ##D #Set up example data:
> ##D decoder.file <- system.file(
> ##D "extdata/annoFiles/decoder.bySample.txt",
> ##D package="JctSeqData");
> ##D decoder <- read.table(decoder.file,
> ##D header=TRUE,
> ##D stringsAsFactors=FALSE);
> ##D gff.file <- system.file(
> ##D "extdata/cts/withNovel.forJunctionSeq.gff.gz",
> ##D package="JctSeqData");
> ##D countFiles <- system.file(paste0("extdata/cts/",
> ##D decoder$sample.ID,
> ##D "/QC.spliceJunctionAndExonCounts.withNovel.forJunctionSeq.txt.gz"),
> ##D package="JctSeqData");
> ##D ########################################
> ##D #Advanced Analysis:
> ##D
> ##D #Make a "design" dataframe:
> ##D design <- data.frame(condition = factor(decoder$group.ID));
> ##D #Read the QoRTs counts.
> ##D jscs = readJunctionSeqCounts(countfiles = countFiles,
> ##D samplenames = decoder$sample.ID,
> ##D design = design,
> ##D flat.gff.file = gff.file
> ##D );
> ##D #Generate the size factors and load them into the JunctionSeqCountSet:
> ##D jscs <- estimateJunctionSeqSizeFactors(jscs);
> ##D #Estimate feature-specific dispersions:
> ##D jscs <- estimateJunctionSeqDispersions(jscs);
> ##D #Fit dispersion function and estimate MAP dispersion:
> ##D jscs <- fitJunctionSeqDispersionFunction(jscs);
> ##D #Test for differential usage:
> ##D jscs <- testForDiffUsage(jscs);
> ##D #Estimate effect sizes and expression estimates:
> ##D jscs <- estimateEffectSizes( jscs);
> ##D
> ## End(Not run)
>
>
>
>
>
> dev.off()
null device
1
>