Expression mask results for a range of cutoff values are
compared with an external mask (for example a mask based on sequence
data) and type 1 and type 2 errors are estimated.
A matrix with 3 columns. The first and second columns
represent the x and y coordinates on the microarray. The third column contains
a quality entry for each probe, e.g. the quality score obtained
from mask analysis.
seqdata
A matrix with 3 columns containing the x and y coordinates and
0/1 entries in column 3, defining whether a probe
has a sequence difference (0) or not (1).
cutoffs
A vector including all cutoff values for the quality
scores of an expression mask that should be used for the error
analysis. If no cutoffs are given (default is "none") the cutoffs
are the quantiles of the quality scores starting from 0 to 1 in
steps of 0.01.
wilcox.ks
Logical (default=FALSE) determining whether the Kolmogorov-Smirnov
test and Wilcoxon rank sum test analysis should be performed (see reference below).
sample
To compare the p value distribution with the
Kolmogorov-Smirnov test and Wilcoxon rank sum test for
different cutoffs, the sampling option can be used to compute the quality
score distribution for different cutoffs. This value (default=10) indicates
how often the sampling should be performed.
plotCutoffs
Logical (default=TRUE) determining whether the cutoffs
should be drawn in the overlap plot.
verbose
Logical. If 'TRUE', messages indicating progress are
written out. If 'FALSE', nothing is printed.
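The default cutoff grid described for cutoffs can be reproduced with base R. This is only a minimal sketch: the quality scores are simulated, the object name probes is illustrative, and the scores are assumed to sit in column 3 as described for the first argument.

```r
## Simulated stand-in for the first argument: x, y, quality score
## (not package data; names are illustrative only)
set.seed(1)
probes <- cbind(x = sample(1:100, 500, replace = TRUE),
                y = sample(1:100, 500, replace = TRUE),
                quality.score = runif(500))

## Quantiles of the quality scores from 0 to 1 in steps of 0.01
cutoffs <- quantile(probes[, 3], probs = seq(0, 1, by = 0.01))

length(cutoffs)   # one cutoff per probability in the grid
range(cutoffs)    # spans the smallest and largest quality score
```

A user-supplied numeric vector passed as cutoffs would replace this grid.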
Details
The function overlapExprExtMasks compares expression mask
results with an external (for example sequence-based) mask and might
help to choose a quality score cutoff for masking probes.
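The comparison can be illustrated with a small simulation. The sketch below is not the package's implementation: it assumes that a probe counts as masked when its quality score lies below the cutoff, that the type 1 error is the fraction of probes without a sequence difference that are masked, and that the type 2 error is the fraction of probes with a sequence difference that are not masked. All data are simulated.

```r
## Simulated inputs (not package data)
set.seed(42)
n <- 1000
## Column 3 of seqdata: 0 = sequence difference, 1 = none
seq_flag <- rbinom(n, 1, 0.8)
## Assumption: probes with a sequence difference tend to score low
quality <- ifelse(seq_flag == 0, rbeta(n, 1, 5), rbeta(n, 5, 1))

cutoffs <- quantile(quality, probs = seq(0, 1, by = 0.01))

## Assumption: a probe is masked when its score is below the cutoff
errors <- t(sapply(cutoffs, function(ct) {
  masked <- quality < ct
  c(type1 = mean(masked[seq_flag == 1]),    # masked, but no sequence difference
    type2 = mean(!masked[seq_flag == 0]))   # not masked despite a sequence difference
}))

head(errors)
```

Under these assumptions the type 1 error can only grow and the type 2 error can only shrink as the cutoff increases, which is the trade-off a cutoff choice has to balance.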
Value
A list with the following components will be returned.
type1
A vector of the type 1 error for each cutoff.
type2
A vector of the type 2 error for each cutoff.
confT1
A matrix with the upper (column 1) and lower (column
2) bounds of the confidence interval for the type 1 error.
confT2
A matrix with the upper (column 1) and lower (column
2) bounds of the confidence interval for the type 2 error.
ksP
If wilcox.ks is 'TRUE', a vector of p values
from a two-sample Kolmogorov-Smirnov test comparing the distributions
of quality scores for probes designated as BAD and not in the external
mask.
wilcoxonP
If wilcox.ks is 'TRUE', a vector of p values
from a two-sample Wilcoxon rank sum test comparing the distributions
of quality scores for probes designated as BAD and not in the external
mask.
ksBoot
For each cutoff, values for the Kolmogorov-Smirnov test
are generated sample (default=10) times.
wilcoxBoot
For each cutoff, values for the Wilcoxon rank sum test
are generated sample (default=10) times.
cutoffs
List of cutoffs used for the error analysis.
testCutoffs
If wilcox.ks is 'TRUE', a list with cutoff
information will be provided. The first list entry includes all
cutoffs used in the two-sample Kolmogorov-Smirnov test and the
two-sample Wilcoxon rank sum test analysis. A cutoff
can appear sample (default=10) times. In theory this vector should
contain sample times the number of cutoff values entries, but
usually there are fewer, because for certain cutoff values
it is not possible to calculate the exact p value in one of the
tests. The second list entry transforms the cutoffs into ranks and can
be used for plotting the test results.
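The kind of two-sample comparison behind ksP and wilcoxonP can be sketched with ks.test and wilcox.test from the stats package. Which probes the function actually groups together, and how it samples, is not specified on this page. The sketch below simply assumes that probes scoring below one cutoff are split by their external mask entry. All data are simulated.

```r
## Simulated inputs (not package data)
set.seed(7)
n <- 1000
seq_flag <- rbinom(n, 1, 0.8)          # 0 = sequence difference, 1 = none
quality <- ifelse(seq_flag == 0, rbeta(n, 1, 5), rbeta(n, 5, 1))

cutoff <- unname(quantile(quality, 0.5))
below <- quality < cutoff              # assumption: these probes would be masked

in_ext  <- quality[below & seq_flag == 0]   # below cutoff and in the external mask
not_ext <- quality[below & seq_flag == 1]   # below cutoff but not in the external mask

## Two-sample tests on the two quality score distributions
ks_p     <- ks.test(in_ext, not_ext)$p.value
wilcox_p <- wilcox.test(in_ext, not_ext)$p.value

c(ks = ks_p, wilcoxon = wilcox_p)
```

With tied scores both tests warn that an exact p value cannot be computed, which may be related to the missing entries mentioned for testCutoffs.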
Author(s)
Michael Dannemann
References
Dannemann et al., The effects of probe binding
affinity differences on gene expression measurements and how to deal
with them. Bioinformatics 2009.
See Also
mask, prepareMaskedAffybatch,
plotProbe
Examples
## loading mask on all genes (exmask1) of the same dataset
data(exmask)
overlapExSeq <- overlapExprExtMasks(exmask$probes[,1:3],sequenceMask[,c(1,2,4)])
## plot results
plot(overlapExSeq$type1,overlapExSeq$type2,type="l",col="red",
main="Overlap expression based mask - sequence based mask",xlab="Type 1",ylab="Type 2")
abline(1,-1,col="gray")
## performing wilcoxon rank sum test and Kolmogorov-Smirnov test on
## expression mask with all genes (exmask)
overlapTests <-
overlapExprExtMasks(exmask$probes[,1:3],sequenceMask[,c(1,2,4)],wilcox.ks=TRUE)
layout(matrix(1:2,ncol=1))
plot(overlapTests$testCutoff[[1]],overlapTests$ksBoot,col="red",main="Kolmogorov-Smirnov Test",xlab="Quality score cutoff",
ylab="p value (Kolmogorov-Smirnov Test)",ylim=c(0,1),pch=16,xaxt="n")
axis(1,at=1:length(unique(overlapTests$testCutoff[[2]])),labels=signif(unique(overlapTests$testCutoff[[2]]),2),las=3)
lines(which(unique(overlapTests$testCutoff[[2]]) %in% overlapTests$testCutoff[[2]]),overlapTests$ksP[!is.na(overlapTests$ksP)],type="p",pch=16,cex=0.8)
plot(overlapTests$testCutoff[[1]],overlapTests$wilcoxonBoot,col="green",main="Wilcoxon Rank Sum Test",xlab="Quality score cutoff",
ylab="p value (Wilcoxon Rank Sum Test)",ylim=c(0,1),pch=16,xaxt="n")
axis(1,at=1:length(unique(overlapTests$testCutoff[[2]])),labels=signif(unique(overlapTests$testCutoff[[2]]),2),las=3)
lines(which(unique(overlapTests$testCutoff[[2]]) %in% overlapTests$testCutoff[[2]]),overlapTests$wilcoxonP[!is.na(overlapTests$wilcoxonP)],type="p",pch=16,cex=0.8)
Results
> library(maskBAD)
Loading required package: gcrma
Loading required package: affy
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from 'package:stats':
IQR, mad, xtabs
The following objects are masked from 'package:base':
Filter, Find, Map, Position, Reduce, anyDuplicated, append,
as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
rbind, rownames, sapply, setdiff, sort, table, tapply, union,
unique, unsplit
Loading required package: Biobase
Welcome to Bioconductor
Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/maskBAD/overlapExprExtMasks.Rd_%03d_medium.png", width=480, height=480)
> ### Name: overlapExprExtMasks
> ### Title: Error Analysis of Masking Results
> ### Aliases: overlapExprExtMasks
> ### Keywords: internal
>
> ### ** Examples
>
> ## loading mask on all genes (exmask1) of the same dataset
> data(exmask)
> overlapExSeq <- overlapExprExtMasks(exmask$probes[,1:3],sequenceMask[,c(1,2,4)])
0.99 % 2 % 3 % 4 % 5 % 5.9 % 6.9 % 7.9 % 8.9 % 9.9 % 11 % 12 % 13 % 14 % 15 % 16 % 17 % 18 % 19 % 20 % 21 % 22 % 23 % 24 % 25 % 26 % 27 % 28 % 29 % 30 % 31 % 32 % 33 % 34 % 35 % 36 % 37 % 38 % 39 % 40 % 41 % 42 % 43 % 44 % 45 % 46 % 47 % 48 % 49 % 50 % 50 % 51 % 52 % 53 % 54 % 55 % 56 % 57 % 58 % 59 % 60 % 61 % 62 % 63 % 64 % 65 % 66 % 67 % 68 % 69 % 70 % 71 % 72 % 73 % 74 % 75 % 76 % 77 % 78 % 79 % 80 % 81 % 82 % 83 % 84 % 85 % 86 % 87 % 88 % 89 % 90 % 91 % 92 % 93 % 94 % 95 % 96 % 97 % 98 % 99 % 100 %
>
> ## plot results
> plot(overlapExSeq$type1,overlapExSeq$type2,type="l",col="red",
+ main="Overlap expression based mask - sequence based mask",xlab="Type 1",ylab="Type 2")
> abline(1,-1,col="gray")
>
> ## performing wilcoxon rank sum test and Kolmogorov-Smirnov test on
> ## expression mask with all genes (exmask)
> overlapTests <-
+ overlapExprExtMasks(exmask$probes[,1:3],sequenceMask[,c(1,2,4)],wilcox.ks=TRUE)
0.99 % 2 % 3 % 4 % 5 % 5.9 % 6.9 % 7.9 % 8.9 % 9.9 % 11 % 12 % 13 % 14 % 15 % 16 % 17 % 18 % 19 % 20 % 21 % 22 % 23 % 24 % 25 % 26 % 27 % 28 % 29 % 30 % 31 % 32 % 33 % 34 % 35 % 36 % 37 % 38 % 39 % 40 % 41 % 42 % 43 % 44 % 45 % 46 % 47 % 48 % 49 % 50 % 50 % 51 % 52 % 53 % 54 % 55 % 56 % 57 % 58 % 59 % 60 % 61 % 62 % 63 % 64 % 65 % 66 % 67 % 68 % 69 % 70 % 71 % 72 % 73 % 74 % 75 % 76 % 77 % 78 % 79 % 80 % 81 % 82 % 83 % 84 % 85 % 86 % 87 % 88 % 89 % 90 % 91 % 92 % 93 % 94 % 95 % 96 % 97 % 98 % 99 % 100 %
> layout(matrix(1:2,ncol=1))
> plot(overlapTests$testCutoff[[1]],overlapTests$ksBoot,col="red",main="Kolmogorov-Smirnov Test",xlab="Quality score cutoff",
+ ylab="p value (Kolmogorov-Smirnov Test)",ylim=c(0,1),pch=16,xaxt="n")
> axis(1,at=1:length(unique(overlapTests$testCutoff[[2]])),labels=signif(unique(overlapTests$testCutoff[[2]]),2),las=3)
> lines(which(unique(overlapTests$testCutoff[[2]]) %in% overlapTests$testCutoff[[2]]),overlapTests$ksP[!is.na(overlapTests$ksP)],type="p",pch=16,cex=0.8)
> plot(overlapTests$testCutoff[[1]],overlapTests$wilcoxonBoot,col="green",main="Wilcoxon Rank Sum Test",xlab="Quality score cutoff",
+ ylab="p value (Wilcoxon Rank Sum Test)",ylim=c(0,1),pch=16,xaxt="n")
> axis(1,at=1:length(unique(overlapTests$testCutoff[[2]])),labels=signif(unique(overlapTests$testCutoff[[2]]),2),las=3)
> lines(which(unique(overlapTests$testCutoff[[2]]) %in% overlapTests$testCutoff[[2]]),overlapTests$wilcoxonP[!is.na(overlapTests$wilcoxonP)],type="p",pch=16,cex=0.8)
>
> dev.off()
null device
1
>