An IntensityData object containing BAlleleFreq and LogRRatio.
The order of the rows of intenData and the snp annotation
are expected to be by chromosome and then by position within chromosome.
genoData
A GenotypeData object. The order of the rows of intenData and the snp annotation
are expected to be by chromosome and then by position within chromosome.
snp.ids
vector of eligible SNP ids. Usually excludes failed and intensity-only SNPs.
It is also recommended to exclude the HLA region on chromosome 6 and the
XTR region on the X chromosome. See HLA and pseudoautosomal.
If there are SNPs annotated in the centromere gap, exclude these as
well (see centromeres).
anom
data.frame of detected chromosome anomalies. Names must include "scanID",
"chromosome", "left.index", "right.index", "sex", "method", "anom.id".
Valid values for "method" are "BAF" or "LOH" referring to whether the anomaly
was detected by BAF method (anomDetectBAF) or by LOH method
(anomDetectLOH).
Here "left.index" and "right.index" are row indices of intenData with left.index < right.index.
centromere
data.frame with centromere position info. Names must include
"chrom", "left.base", "right.base". Valid values for "chrom" are
1:22, "X", "Y", "XY". Here "left.base" and "right.base"
are start and end base positions of the centromere location,
respectively. Centromere data tables are provided in centromeres.
lrr.cut
count the number of eligible LRR values less than lrr.cut
verbose
whether to print the scan id currently being processed
anom.stats
data.frame of chromosome anomalies with statistics, usually the output
of anomSegStats. Names must include "anom.id", "scanID", "chromosome",
"left.index", "right.index", "method", "nmark.all", "nmark.elig", "left.base", "right.base",
"nbase", "non.anom.baf.med", "non.anom.lrr.med", "anom.baf.dev.med",
"anom.baf.dev.5", "anom.lrr.med", "nmark.baf", "nmark.lrr". Left and right refer
to start and end, respectively, of the anomaly, in position order.
snp.ineligible
vector of ineligible snp ids (e.g., intensity-only, failed snps, XTR and HLA regions).
See HLA and pseudoautosomal.
plot.ineligible
whether or not to include ineligible points in the plot for LogRRatio
brackets
type of brackets to plot around breakpoints: none, use base length, or use number of markers (note that using markers gives asymmetric brackets);
could be used, along with brkpt.pct, to assess general accuracy of the end points of the anomaly
brkpt.pct
percent of anomaly length in bases (or number of markers) for width of brackets
whole.chrom
logical to plot the whole chromosome or not (overrides win and zoom)
win
size of the window (a multiple of anomaly length) surrounding the anomaly to plot
win.calc
logical to calculate window size from anomaly length; overrides win and gives window of fixed length given by win.fixed
win.fixed
number of megabases for window size when win.calc=TRUE
zoom
indicates whether plot includes the whole anomaly ("both") or zooms on just the left or right breakpoint; "both" is default
main
Vector of titles for upper (LRR) plots. If NULL, titles will
include anom.id, scanID, sex, chromosome, and detection method.
info
character vector of extra information to include in the main title of
the upper (LRR) plot
ideogram
logical for whether to plot a chromosome ideogram under
the BAF and LRR plots.
ideo.zoom
logical for whether to zoom in on the ideogram to
match the range of the BAF/LRR plots
ideo.rect
logical for whether to draw a rectangle on the
ideogram indicating the range of the BAF/LRR plots
mult.anom
logical for whether to plot multiple anomalies from
the same scan-chromosome pair on a single plot. If FALSE
(default), each anomaly is shown on a separate plot.
cex
cex value for points on the plots
cex.leg
cex value for the ideogram legend
colors
Color scheme to use for genotypes. "default" is colorblind safe (colorbrewer Set2), "neon" is bright orange/green/fuchsia, and "primary" is red/green/blue.
...
Other parameters to be passed directly to plot.
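The column requirements for anom and centromere described above can be checked before calling anomSegStats. The following is a minimal base-R sketch, not part of GWASTools: check_anom_input is a hypothetical helper, the scanID values are invented, and the centromere positions are placeholders (real tables are provided in centromeres).

```r
# Hypothetical helper (not a GWASTools function): validate the documented
# structure of the 'anom' and 'centromere' data.frames.
check_anom_input <- function(anom, centromere) {
  anom.cols <- c("scanID", "chromosome", "left.index", "right.index",
                 "sex", "method", "anom.id")
  cent.cols <- c("chrom", "left.base", "right.base")
  stopifnot(
    all(anom.cols %in% names(anom)),
    all(cent.cols %in% names(centromere)),
    all(anom$method %in% c("BAF", "LOH")),
    all(anom$left.index < anom$right.index),
    all(centromere$chrom %in% c(1:22, "X", "Y", "XY"))
  )
  invisible(TRUE)
}

# Invented anomalies for illustration.
anom <- data.frame(scanID = c(280, 280), chromosome = c(21, 21),
                   left.index = c(100, 300), right.index = c(200, 400),
                   sex = c("M", "M"), method = c("BAF", "BAF"),
                   anom.id = 1:2, stringsAsFactors = FALSE)

# Placeholder base positions for illustration only.
centromere <- data.frame(chrom = c("21", "22"),
                         left.base = c(1.0e7, 1.2e7),
                         right.base = c(1.3e7, 1.5e7),
                         stringsAsFactors = FALSE)

ok <- check_anom_input(anom, centromere)
```

A check of this kind fails early with a readable message, for example when "method" contains a value other than "BAF" or "LOH".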
Details
anomSegStats computes various statistics of the input anomalies.
Some of these are basic statistics for the characteristics of the anomaly and for measuring deviation of LRR or BAF from expected.
Other statistics are used in downstream quality control analysis, including detecting
terminal anomalies and investigating centromere-spanning anomalies.
anomStatsPlot produces separate png images of each anomaly in anom.stats. Each image consists of
an upper plot of LogRRatio values and a lower plot of BAlleleFrequency values for
a zoomed region around the anomaly or whole chromosome (depending on parameter
choices). Each plot has vertical lines demarcating the anomaly and horizontal lines
displaying certain statistics from anomSegStats. The upper plot title
includes sample number and chromosome. Further plot annotation describes which
anomaly statistics are represented.
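The two-panel layout described above can be illustrated with base graphics. This is only a sketch on simulated data, assuming a single anomaly between two row indices; it does not reproduce the actual output of anomStatsPlot, and all values (positions, LRR shift, BAF split) are invented.

```r
# Simulated data for illustration only; not output of anomSegStats.
set.seed(1)
n <- 500
pos <- sort(sample.int(4e7, n))           # base positions along a chromosome
left.index <- 200; right.index <- 300      # anomaly boundaries (row indices)
in.anom <- seq_len(n) >= left.index & seq_len(n) <= right.index

lrr <- rnorm(n, mean = ifelse(in.anom, -0.4, 0), sd = 0.15)
baf <- rnorm(n, mean = 0.5, sd = 0.03)
baf[in.anom] <- 0.5 + sample(c(-1, 1), sum(in.anom), replace = TRUE) * 0.15 +
  rnorm(sum(in.anom), sd = 0.03)

out.file <- tempfile(fileext = ".png")
png(out.file, width = 480, height = 480)
par(mfrow = c(2, 1), mar = c(4, 4, 2, 1))
# Upper panel: LogRRatio with anomaly boundaries and within-anomaly median.
plot(pos, lrr, pch = 20, cex = 0.5, xlab = "position", ylab = "LogRRatio",
     main = "simulated anomaly")
abline(v = pos[c(left.index, right.index)], col = "gray40")
abline(h = median(lrr[in.anom]), lty = 2)
# Lower panel: BAlleleFreq with the same boundaries.
plot(pos, baf, pch = 20, cex = 0.5, xlab = "position", ylab = "BAlleleFreq",
     ylim = c(0, 1))
abline(v = pos[c(left.index, right.index)], col = "gray40")
invisible(dev.off())
```

The vertical lines mark the anomaly end points and the dashed horizontal line marks the median LRR inside the anomaly, mirroring the roles these annotations play in the description above.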
Value
anomSegStats produces a data.frame with the variables for anom plus the following columns:
Left and right refer to position order with left < right.
nmark.all
total number of SNP markers on the array from left.index to right.index inclusive
nmark.elig
total number of eligible SNP markers on the array from left.index to right.index, inclusive.
See snp.ids for definition of eligible SNP markers.
left.base
base position corresponding to left.index
right.base
base position corresponding to right.index
nbase
number of bases from left.index to right.index, inclusive
non.anom.baf.med
BAF median of non-anomalous segments on all autosomes for the associated sample,
using eligible heterozygous or missing SNP markers
non.anom.lrr.med
LRR median of non-anomalous segments on all autosomes for the associated sample,
using eligible SNP markers
non.anom.lrr.mad
MAD for LRR of non-anomalous segments on all autosomes for the associated sample, using eligible SNP markers
anom.baf.dev.med
BAF median of deviations from non.anom.baf.med of points used to detect anomaly (eligible and heterozygous or missing)
anom.baf.dev.5
median of BAF deviations from 0.5, using eligible heterozygous or missing SNP markers in anomaly
anom.baf.dev.mean
mean of BAF deviations from non.anom.baf.med, using eligible heterozygous or missing SNP markers in anomaly
anom.baf.sd
standard deviation of BAF deviations from non.anom.baf.med, using eligible heterozygous or missing SNP markers in anomaly
anom.baf.mad
MAD of BAF deviations from non.anom.baf.med, using eligible heterozygous or missing SNP markers in anomaly
anom.lrr.med
LRR median of eligible SNP markers within the anomaly
anom.lrr.sd
standard deviation of LRR for eligible SNP markers within the anomaly
anom.lrr.mad
MAD of LRR for eligible SNP markers within the anomaly
nmark.baf
number of SNP markers within the anomaly eligible for BAF detection (eligible markers that are heterozygous or missing)
nmark.lrr
number of SNP markers within the anomaly eligible for LOH detection (eligible markers)
cent.rel
position relative to centromere - left, right, span
left.most
T/F for whether the anomaly is the left-most anomaly for this sample-chromosome,
i.e. no other anomalies with smaller start base position
right.most
T/F for whether the anomaly is the right-most anomaly for this sample-chromosome,
i.e. no other anomalies with larger end base position
left.last.elig
T/F for whether the anomaly contains the last eligible SNP marker going to the left (decreasing position)
right.last.elig
T/F for whether the anomaly contains the last eligible SNP marker going to the right (increasing position)
left.term.lrr.med
median of LRR for all eligible SNP markers from left-most eligible marker to the left telomere
(only calculated for the most distal anom)
right.term.lrr.med
median of LRR for all eligible markers from right-most eligible marker to the right telomere
(only calculated for the most distal anom)
left.term.lrr.n
sample size for calculating left.term.lrr.med
right.term.lrr.n
sample size for calculating right.term.lrr.med
cent.span.left.elig.n
number of eligible markers on the left side of centromere-spanning anomalies
cent.span.right.elig.n
number of eligible markers on the right side of centromere-spanning anomalies
cent.span.left.bases
length of anomaly (in bases) covered by eligible markers on the left side of the centromere
cent.span.right.bases
length of anomaly (in bases) covered by eligible markers on the right side of the centromere
cent.span.left.index
index of eligible marker left-adjacent to centromere;
recall that index refers to row indices of intenData
cent.span.right.index
index of eligible marker right-adjacent to centromere
bafmetric.anom.mean
mean of BAF-metric values within anomaly, using eligible heterozygous or missing SNP markers. BAF-metric values were used in the
detection of anomalies. See anomDetectBAF for the definition of BAF-metric.
bafmetric.non.anom.mean
mean of BAF-metric values within non-anomalous segments
across all autosomes for the associated sample, using eligible heterozygous or missing SNP markers
bafmetric.non.anom.sd
standard deviation of BAF-metric values within non-anomalous segments
across all autosomes for the associated sample, using eligible heterozygous or missing SNP markers
nmark.lrr.low
number of eligible markers within anomaly with LRR values less than lrr.cut
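Several of the columns above follow directly from their definitions. The sketch below computes a few of them on simulated vectors for a single anomaly. It is an illustration under stated assumptions, not the code used by anomSegStats: genotypes are coded so that 1 means heterozygous, "deviation" is taken to be an absolute difference, and the lrr.cut value is chosen only for the example.

```r
# Toy data for one chromosome of one sample; all values are simulated.
set.seed(42)
n <- 1000
snpID    <- seq_len(n)
position <- sort(sample.int(5e7, n))
lrr      <- rnorm(n, sd = 0.2)
baf      <- runif(n)
geno     <- sample(c(0, 1, 2, NA), n, replace = TRUE)  # assumed: 1 = heterozygous

snp.ids    <- snpID[snpID %% 10 != 0]   # pretend every 10th SNP is ineligible
left.index <- 100; right.index <- 200
lrr.cut    <- -2                        # threshold chosen for illustration

idx    <- left.index:right.index
elig   <- snpID[idx] %in% snp.ids
# BAF statistics use eligible markers that are heterozygous or missing.
baf.ok <- elig & (is.na(geno[idx]) | geno[idx] == 1)

stats <- list(
  nmark.all      = length(idx),
  nmark.elig     = sum(elig),
  left.base      = position[left.index],
  right.base     = position[right.index],
  nbase          = position[right.index] - position[left.index] + 1,
  anom.baf.dev.5 = median(abs(baf[idx][baf.ok] - 0.5)),  # assumes absolute deviation
  anom.lrr.med   = median(lrr[idx][elig]),
  nmark.baf      = sum(baf.ok),
  nmark.lrr      = sum(elig),
  nmark.lrr.low  = sum(lrr[idx][elig] < lrr.cut)
)
str(stats)
```

Both index ranges and base ranges are inclusive, which is why nmark.all and nbase each add one to the difference of their end points.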
Note
The non-anomalous statistics are computed over all autosomes for
the sample associated with an anomaly. Therefore the accuracy of these statistics
relies on the input anomaly data.frame including all autosomal anomalies for a given sample.
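The effect described in this note can be seen on simulated data. In the sketch below, non_anom_median is a hypothetical helper (not a GWASTools function) that takes the median LRR outside the listed anomalies; the LRR values and anomaly positions are invented.

```r
set.seed(7)
n   <- 2000
lrr <- rnorm(n, mean = 0, sd = 0.1)
# Two simulated anomalies with shifted LRR.
anoms <- data.frame(left.index = c(100, 1200), right.index = c(600, 1700))
for (i in seq_len(nrow(anoms))) {
  k <- anoms$left.index[i]:anoms$right.index[i]
  lrr[k] <- lrr[k] - 0.5
}

# Hypothetical helper: median LRR outside the listed anomalies.
non_anom_median <- function(lrr, anoms) {
  keep <- rep(TRUE, length(lrr))
  for (i in seq_len(nrow(anoms))) {
    keep[anoms$left.index[i]:anoms$right.index[i]] <- FALSE
  }
  median(lrr[keep])
}

med.all  <- non_anom_median(lrr, anoms)                      # both anomalies supplied
med.miss <- non_anom_median(lrr, anoms[1, , drop = FALSE])   # second anomaly omitted
c(complete = med.all, incomplete = med.miss)
```

When one anomaly is left out of the input, its shifted points are treated as non-anomalous and pull the baseline median away from the value obtained with the complete list.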
> library(GWASTools)
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/GWASTools/anomSegStats.Rd_%03d_medium.png", width=480, height=480)
> ### Name: anomSegStats
> ### Title: Calculate LRR and BAF statistics for anomalous segments
> ### Aliases: anomSegStats anomStatsPlot
> ### Keywords: manip hplot
>
> ### ** Examples
>
> library(GWASdata)
> data(illuminaScanADF, illuminaSnpADF)
>
> blfile <- system.file("extdata", "illumina_bl.gds", package="GWASdata")
> bl <- GdsIntensityReader(blfile)
> blData <- IntensityData(bl, scanAnnot=illuminaScanADF, snpAnnot=illuminaSnpADF)
>
> genofile <- system.file("extdata", "illumina_geno.gds", package="GWASdata")
> geno <- GdsGenotypeReader(genofile)
> genoData <- GenotypeData(geno, scanAnnot=illuminaScanADF, snpAnnot=illuminaSnpADF)
>
> scan.ids <- illuminaScanADF$scanID[1:2]
> chrom.ids <- unique(illuminaSnpADF$chromosome)
> snp.ids <- illuminaSnpADF$snpID[illuminaSnpADF$missing.n1 < 1]
> snp.failed <- illuminaSnpADF$snpID[illuminaSnpADF$missing.n1 == 1]
>
> # example results from anomDetectBAF
> baf.anoms <- data.frame("scanID"=rep(scan.ids[1],2), "chromosome"=rep(21,2),
+ "left.index"=c(100,300), "right.index"=c(200,400), sex=rep("M",2),
+ method=rep("BAF",2), anom.id=1:2, stringsAsFactors=FALSE)
>
> # example results from anomDetectLOH
> loh.anoms <- data.frame("scanID"=scan.ids[2],"chromosome"=22,
+ "left.index"=400,"right.index"=500, sex="F", method="LOH",
+ anom.id=3, stringsAsFactors=FALSE)
>
> anoms <- rbind(baf.anoms, loh.anoms)
> data(centromeres.hg18)
> stats <- anomSegStats(blData, genoData, snp.ids=snp.ids, anom=anoms,
+ centromere=centromeres.hg18)
>
> anomStatsPlot(blData, genoData, anom.stats=stats,
+ snp.ineligible=snp.failed, centromere=centromeres.hg18)
>
> close(blData)
> close(genoData)
>
> dev.off()
null device
1
>