Last data update: 2014.03.03
R: Manhattan for GWAS
plotGrandLinear R Documentation
Manhattan for GWAS
Description
A Manhattan plot is special scatter plot used to visualize data with a
large number of data points, with a distribute of some higher-magnitude
values. For example, in the GWAS(genome-wide association studies). Here
we mainly focus on GWAS Manhattan plots. X-axis is genomic coordinates
and Y-axis is negative logarithm of the associated P-value for each
single nucleotide polymorphism. So higher the value, more stronger the
association they are.
Usage
plotGrandLinear(obj, ..., facets, space.skip = 0.01, geom = NULL,
cutoff = NULL, cutoff.color = "red", cutoff.size = 1,
legend = FALSE, xlim, ylim, xlab, ylab, main,
highlight.gr = NULL, highlight.name = NULL,
highlight.col = "red", highlight.label = TRUE,
highlight.label.size = 5, highlight.label.offset =
0.05, highlight.label.col = "black", spaceline =
FALSE)
Arguments
obj
GRanges
object which contains extra p value, before users
pass this object, they need to make sure the pvalue has been changed
to -log10(p).
...
extra arguments passed. such as color, size, alpha.
facets
facets formula, such as group ~ .
space.skip
numeric value for skip ratio, between chromosome spaces.default is 0.01.
geom
geometric object, defualt is "point".
cutoff
A numeric vector which used as cutoff for Manhattan plot.
cutoff.color
A character specifying the color used for cutoff. Default is "red".
cutoff.size
A numeric value which used as cutoff line size.
legend
A logical value indicate whether to show legend or not. Default is
FALSE which disabled the legend.
xlim
limits for x scale.
ylim
limits for y scale.
xlab
Label for xscale.
ylab
Label for yscale.
main
title.
highlight.gr
a GRanges
object, this wil highlight overlapped region with
provided intervals.
highlight.name
if NULL
, using rownames of GRanges
object provided by
argument highlight.gr
, otherwise use
character to indicate column used as labeled names.
highlight.col
highlight colors.
highlight.label
logical value, label the highlighted region of not.
highlight.label.size
highlight label size.
highlight.label.offset
highlight label offset.
highlight.label.col
highlight label color.
spaceline
show line between chromosomes.
Details
Please use seqlengths of the object and space.skip arguments to
control the layout of the coordiant genome transformation.
aes(y = ...) is requried.
aes(color = ) is used to mapping to data variables, if just pass
"color" without aes(), then will recycle the color to represent each
chromosomes.please see the example below.
Value
Return a ggplot object.
Author(s)
Tengfei Yin
Examples
## load
library(ggbio)
data(hg19IdeogramCyto, package = "biovizBase")
data(hg19Ideogram, package = "biovizBase")
library(GenomicRanges)
## simul_gr
library(biovizBase)
gr <- GRanges(rep(c("chr1", "chr2"), each = 5),
IRanges(start = rep(seq(1, 100, length = 5), times = 2),
width = 50))
autoplot(gr)
## coord:genome
autoplot(gr, coord = "genome")
gr.t <- transformToGenome(gr)
head(gr.t)
## is
is_coord_genome(gr.t)
metadata(gr.t)$coord
## simul_snp
chrs <- as.character(levels(seqnames(hg19IdeogramCyto)))
seqlths <- seqlengths(hg19Ideogram)[chrs]
set.seed(1)
nchr <- length(chrs)
nsnps <- 100
gr.snp <- GRanges(rep(chrs,each=nsnps),
IRanges(start =
do.call(c, lapply(chrs, function(chr){
N <- seqlths[chr]
runif(nsnps,1,N)
})), width = 1),
SNP=sapply(1:(nchr*nsnps), function(x) paste("rs",x,sep='')),
pvalue = -log10(runif(nchr*nsnps)),
group = sample(c("Normal", "Tumor"), size = nchr*nsnps,
replace = TRUE)
)
## shorter
seqlengths(gr.snp)
nms <- seqnames(seqinfo(gr.snp))
nms.new <- gsub("chr", "", nms)
names(nms.new) <- nms
gr.snp <- renameSeqlevels(gr.snp, nms.new)
seqlengths(gr.snp)
## unorder
autoplot(gr.snp, coord = "genome", geom = "point", aes(y = pvalue), space.skip = 0.01)
## sort
gr.snp <- keepSeqlevels(gr.snp, c(1:22, "X", "Y"))
autoplot(gr.snp, coord = "genome", geom = "point", aes(y = pvalue), space.skip = 0.01)
## with_seql
names(seqlths) <- gsub("chr", "", names(seqlths))
seqlengths(gr.snp) <- seqlths[names(seqlengths(gr.snp))]
autoplot(gr.snp, coord = "genome", geom = "point", aes(y = pvalue), space.skip = 0.01)
## line
autoplot(gr.snp, coord = "genome", geom = "line", aes(y = pvalue, group = seqnames,
color = seqnames))
## plotGrandLinear
plotGrandLinear(gr.snp, aes(y = pvalue))
## morecolor
plotGrandLinear(gr.snp, aes(y = pvalue, color = seqnames))
plotGrandLinear(gr.snp, aes(y = pvalue), color = c("green", "deepskyblue"))
plotGrandLinear(gr.snp, aes(y = pvalue), color = c("green", "deepskyblue", "red"))
plotGrandLinear(gr.snp, aes(y = pvalue), color = "red")
## cutoff
plotGrandLinear(gr.snp, aes(y = pvalue), cutoff = 3, cutoff.color = "blue", cutoff.size = 4)
## cutoff-low
plotGrandLinear(gr.snp, aes(y = pvalue)) + geom_hline(yintercept = 3, color = "blue", size = 4)
## longer
## let's make a long name
nms <- seqnames(seqinfo(gr.snp))
nms.new <- paste("chr00000", nms, sep = "")
names(nms.new) <- nms
gr.snp <- renameSeqlevels(gr.snp, nms.new)
seqlengths(gr.snp)
## rotate
plotGrandLinear(gr.snp, aes(y = pvalue)) + theme(axis.text.x=element_text(angle=-90, hjust=0))
## sessionInfo
sessionInfo()
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(ggbio)
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from 'package:stats':
IQR, mad, xtabs
The following objects are masked from 'package:base':
Filter, Find, Map, Position, Reduce, anyDuplicated, append,
as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
rbind, rownames, sapply, setdiff, sort, table, tapply, union,
unique, unsplit
Loading required package: ggplot2
Need specific help about ggbio? try mailing
the maintainer or visit http://tengfei.github.com/ggbio/
Attaching package: 'ggbio'
The following objects are masked from 'package:ggplot2':
geom_bar, geom_rect, geom_segment, ggsave, stat_bin, stat_identity,
xlim
Warning message:
replacing previous import 'ggplot2::Position' by 'BiocGenerics::Position' when loading 'ggbio'
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/ggbio/plotGrandLinear.Rd_%03d_medium.png", width=480, height=480)
> ### Name: plotGrandLinear
> ### Title: Manhattan for GWAS
> ### Aliases: plotGrandLinear
>
> ### ** Examples
>
> ## load
> library(ggbio)
> data(hg19IdeogramCyto, package = "biovizBase")
> data(hg19Ideogram, package = "biovizBase")
> library(GenomicRanges)
Loading required package: S4Vectors
Loading required package: stats4
Attaching package: 'S4Vectors'
The following objects are masked from 'package:base':
colMeans, colSums, expand.grid, rowMeans, rowSums
Loading required package: IRanges
Loading required package: GenomeInfoDb
>
> ## simul_gr
> library(biovizBase)
> gr <- GRanges(rep(c("chr1", "chr2"), each = 5),
+ IRanges(start = rep(seq(1, 100, length = 5), times = 2),
+ width = 50))
> autoplot(gr)
>
> ## coord:genome
> autoplot(gr, coord = "genome")
using coord:genome to parse x scale
> gr.t <- transformToGenome(gr)
> head(gr.t)
GRanges object with 6 ranges and 2 metadata columns:
seqnames ranges strand | .start .end
<Rle> <IRanges> <Rle> | <numeric> <numeric>
[1] chr1 [ 1, 50] * | 1 50
[2] chr1 [ 25, 74] * | 25 74
[3] chr1 [ 50, 99] * | 50 99
[4] chr1 [ 75, 124] * | 75 124
[5] chr1 [100, 149] * | 100 149
[6] chr2 [ 1, 50] * | 180 229
-------
seqinfo: 2 sequences from an unspecified genome; no seqlengths
>
> ## is
> is_coord_genome(gr.t)
[1] TRUE
> metadata(gr.t)$coord
[1] "genome"
>
>
> ## simul_snp
> chrs <- as.character(levels(seqnames(hg19IdeogramCyto)))
> seqlths <- seqlengths(hg19Ideogram)[chrs]
> set.seed(1)
> nchr <- length(chrs)
> nsnps <- 100
> gr.snp <- GRanges(rep(chrs,each=nsnps),
+ IRanges(start =
+ do.call(c, lapply(chrs, function(chr){
+ N <- seqlths[chr]
+ runif(nsnps,1,N)
+ })), width = 1),
+ SNP=sapply(1:(nchr*nsnps), function(x) paste("rs",x,sep='')),
+ pvalue = -log10(runif(nchr*nsnps)),
+ group = sample(c("Normal", "Tumor"), size = nchr*nsnps,
+ replace = TRUE)
+ )
>
> ## shorter
> seqlengths(gr.snp)
chr1 chr10 chr11 chr12 chr13 chr14 chr15 chr16 chr17 chr18 chr19 chr2 chr20
NA NA NA NA NA NA NA NA NA NA NA NA NA
chr21 chr22 chr3 chr4 chr5 chr6 chr7 chr8 chr9 chrX chrY
NA NA NA NA NA NA NA NA NA NA NA
> nms <- seqnames(seqinfo(gr.snp))
> nms.new <- gsub("chr", "", nms)
> names(nms.new) <- nms
> gr.snp <- renameSeqlevels(gr.snp, nms.new)
> seqlengths(gr.snp)
1 10 11 12 13 14 15 16 17 18 19 2 20 21 22 3 4 5 6 7 8 9 X Y
NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
>
>
>
> ## unorder
> autoplot(gr.snp, coord = "genome", geom = "point", aes(y = pvalue), space.skip = 0.01)
using coord:genome to parse x scale
>
> ## sort
> gr.snp <- keepSeqlevels(gr.snp, c(1:22, "X", "Y"))
> autoplot(gr.snp, coord = "genome", geom = "point", aes(y = pvalue), space.skip = 0.01)
using coord:genome to parse x scale
>
> ## with_seql
> names(seqlths) <- gsub("chr", "", names(seqlths))
> seqlengths(gr.snp) <- seqlths[names(seqlengths(gr.snp))]
> autoplot(gr.snp, coord = "genome", geom = "point", aes(y = pvalue), space.skip = 0.01)
using coord:genome to parse x scale
>
> ## line
> autoplot(gr.snp, coord = "genome", geom = "line", aes(y = pvalue, group = seqnames,
+ color = seqnames))
using coord:genome to parse x scale
>
>
>
> ## plotGrandLinear
> plotGrandLinear(gr.snp, aes(y = pvalue))
using coord:genome to parse x scale
>
> ## morecolor
> plotGrandLinear(gr.snp, aes(y = pvalue, color = seqnames))
using coord:genome to parse x scale
> plotGrandLinear(gr.snp, aes(y = pvalue), color = c("green", "deepskyblue"))
using coord:genome to parse x scale
> plotGrandLinear(gr.snp, aes(y = pvalue), color = c("green", "deepskyblue", "red"))
using coord:genome to parse x scale
> plotGrandLinear(gr.snp, aes(y = pvalue), color = "red")
using coord:genome to parse x scale
>
> ## cutoff
> plotGrandLinear(gr.snp, aes(y = pvalue), cutoff = 3, cutoff.color = "blue", cutoff.size = 4)
using coord:genome to parse x scale
>
> ## cutoff-low
> plotGrandLinear(gr.snp, aes(y = pvalue)) + geom_hline(yintercept = 3, color = "blue", size = 4)
using coord:genome to parse x scale
>
> ## longer
> ## let's make a long name
> nms <- seqnames(seqinfo(gr.snp))
> nms.new <- paste("chr00000", nms, sep = "")
> names(nms.new) <- nms
> gr.snp <- renameSeqlevels(gr.snp, nms.new)
> seqlengths(gr.snp)
chr000001 chr000002 chr000003 chr000004 chr000005 chr000006 chr000007
249250621 243199373 198022430 191154276 180915260 171115067 159138663
chr000008 chr000009 chr0000010 chr0000011 chr0000012 chr0000013 chr0000014
146364022 141213431 135534747 135006516 133851895 115169878 107349540
chr0000015 chr0000016 chr0000017 chr0000018 chr0000019 chr0000020 chr0000021
102531392 90354753 81195210 78077248 59128983 63025520 48129895
chr0000022 chr00000X chr00000Y
51304566 155270560 59373566
>
> ## rotate
> plotGrandLinear(gr.snp, aes(y = pvalue)) + theme(axis.text.x=element_text(angle=-90, hjust=0))
using coord:genome to parse x scale
>
> ## sessionInfo
> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04 LTS
locale:
[1] C
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] biovizBase_1.20.0 GenomicRanges_1.24.2 GenomeInfoDb_1.8.1
[4] IRanges_2.6.1 S4Vectors_0.10.1 ggbio_1.20.1
[7] ggplot2_2.1.0 BiocGenerics_0.18.0
loaded via a namespace (and not attached):
[1] Rcpp_0.12.5 lattice_0.20-33
[3] Rsamtools_1.24.0 Biostrings_2.40.2
[5] digest_0.6.9 mime_0.4
[7] R6_2.1.2 plyr_1.8.4
[9] chron_2.3-47 acepack_1.3-3.3
[11] RSQLite_1.0.0 httr_1.2.0
[13] BiocInstaller_1.22.3 zlibbioc_1.18.0
[15] GenomicFeatures_1.24.3 data.table_1.9.6
[17] rpart_4.1-10 Matrix_1.2-6
[19] labeling_0.3 splines_3.3.1
[21] BiocParallel_1.6.2 AnnotationHub_2.4.2
[23] stringr_1.0.0 foreign_0.8-66
[25] RCurl_1.95-4.8 biomaRt_2.28.0
[27] munsell_0.4.3 shiny_0.13.2
[29] httpuv_1.3.3 rtracklayer_1.32.1
[31] htmltools_0.3.5 nnet_7.3-12
[33] SummarizedExperiment_1.2.3 gridExtra_2.2.1
[35] interactiveDisplayBase_1.10.3 Hmisc_3.17-4
[37] XML_3.98-1.4 reshape_0.8.5
[39] GenomicAlignments_1.8.3 bitops_1.0-6
[41] RBGL_1.48.1 grid_3.3.1
[43] xtable_1.8-2 GGally_1.1.0
[45] gtable_0.2.0 DBI_0.4-1
[47] magrittr_1.5 scales_0.4.0
[49] graph_1.50.0 stringi_1.1.1
[51] XVector_0.12.0 reshape2_1.4.1
[53] latticeExtra_0.6-28 Formula_1.2-1
[55] RColorBrewer_1.1-2 ensembldb_1.4.6
[57] tools_3.3.1 dichromat_2.0-0
[59] OrganismDbi_1.14.1 BSgenome_1.40.1
[61] Biobase_2.32.0 survival_2.39-5
[63] AnnotationDbi_1.34.3 colorspace_1.2-6
[65] cluster_2.0.4 VariantAnnotation_1.18.1
>
>
>
>
>
> dev.off()
null device
1
>