Last data update: 2014.03.03

R: Manhattan for GWAS
plotGrandLinearR Documentation

Manhattan for GWAS

Description

A Manhattan plot is special scatter plot used to visualize data with a large number of data points, with a distribute of some higher-magnitude values. For example, in the GWAS(genome-wide association studies). Here we mainly focus on GWAS Manhattan plots. X-axis is genomic coordinates and Y-axis is negative logarithm of the associated P-value for each single nucleotide polymorphism. So higher the value, more stronger the association they are.

Usage

plotGrandLinear(obj, ..., facets, space.skip = 0.01, geom = NULL,
                 cutoff = NULL, cutoff.color = "red", cutoff.size = 1,
                 legend = FALSE, xlim, ylim, xlab, ylab, main,
                 highlight.gr = NULL, highlight.name = NULL,
                 highlight.col = "red", highlight.label = TRUE,
                 highlight.label.size = 5, highlight.label.offset =
                 0.05, highlight.label.col = "black", spaceline =
                 FALSE)

Arguments

obj

GRanges object which contains extra p value, before users pass this object, they need to make sure the pvalue has been changed to -log10(p).

...

extra arguments passed. such as color, size, alpha.

facets

facets formula, such as group ~ .

space.skip

numeric value for skip ratio, between chromosome spaces.default is 0.01.

geom

geometric object, defualt is "point".

cutoff

A numeric vector which used as cutoff for Manhattan plot.

cutoff.color

A character specifying the color used for cutoff. Default is "red".

cutoff.size

A numeric value which used as cutoff line size.

legend

A logical value indicate whether to show legend or not. Default is FALSE which disabled the legend.

xlim

limits for x scale.

ylim

limits for y scale.

xlab

Label for xscale.

ylab

Label for yscale.

main

title.

highlight.gr

a GRanges object, this wil highlight overlapped region with provided intervals.

highlight.name

if NULL, using rownames of GRanges object provided by argument highlight.gr, otherwise use character to indicate column used as labeled names.

highlight.col

highlight colors.

highlight.label

logical value, label the highlighted region of not.

highlight.label.size

highlight label size.

highlight.label.offset

highlight label offset.

highlight.label.col

highlight label color.

spaceline

show line between chromosomes.

Details

Please use seqlengths of the object and space.skip arguments to control the layout of the coordiant genome transformation.

aes(y = ...) is requried.

aes(color = ) is used to mapping to data variables, if just pass "color" without aes(), then will recycle the color to represent each chromosomes.please see the example below.

Value

Return a ggplot object.

Author(s)

Tengfei Yin

Examples

##  load
library(ggbio)
data(hg19IdeogramCyto, package = "biovizBase")
data(hg19Ideogram, package = "biovizBase")
library(GenomicRanges)

##  simul_gr
library(biovizBase)
gr <- GRanges(rep(c("chr1", "chr2"), each = 5),
              IRanges(start = rep(seq(1, 100, length = 5), times = 2),
                      width = 50))
autoplot(gr)

##  coord:genome
autoplot(gr, coord = "genome")
gr.t <- transformToGenome(gr)
head(gr.t)

##  is
is_coord_genome(gr.t)
metadata(gr.t)$coord


##  simul_snp
chrs <- as.character(levels(seqnames(hg19IdeogramCyto)))
seqlths <- seqlengths(hg19Ideogram)[chrs]
set.seed(1)
nchr <- length(chrs)
nsnps <- 100
gr.snp <- GRanges(rep(chrs,each=nsnps),
                  IRanges(start =
                          do.call(c, lapply(chrs, function(chr){
                            N <- seqlths[chr]
                            runif(nsnps,1,N)
                          })), width = 1),
                  SNP=sapply(1:(nchr*nsnps), function(x) paste("rs",x,sep='')),
                  pvalue =  -log10(runif(nchr*nsnps)),
                  group = sample(c("Normal", "Tumor"), size = nchr*nsnps,
                    replace = TRUE)
                  )

##  shorter
seqlengths(gr.snp)
nms <- seqnames(seqinfo(gr.snp))
nms.new <- gsub("chr", "", nms)
names(nms.new) <- nms
gr.snp <- renameSeqlevels(gr.snp, nms.new)
seqlengths(gr.snp)



##  unorder
autoplot(gr.snp, coord = "genome", geom = "point", aes(y = pvalue), space.skip = 0.01)

##  sort
gr.snp <- keepSeqlevels(gr.snp, c(1:22, "X", "Y"))
autoplot(gr.snp, coord = "genome", geom = "point", aes(y = pvalue), space.skip = 0.01)

##  with_seql
names(seqlths) <- gsub("chr", "", names(seqlths))
seqlengths(gr.snp) <- seqlths[names(seqlengths(gr.snp))]
autoplot(gr.snp, coord = "genome", geom = "point", aes(y = pvalue), space.skip = 0.01)

##  line
autoplot(gr.snp, coord = "genome", geom = "line", aes(y = pvalue, group = seqnames,
                                     color = seqnames))



##  plotGrandLinear
plotGrandLinear(gr.snp, aes(y = pvalue))

##  morecolor
plotGrandLinear(gr.snp, aes(y = pvalue, color = seqnames))
plotGrandLinear(gr.snp, aes(y = pvalue), color = c("green", "deepskyblue"))
plotGrandLinear(gr.snp, aes(y = pvalue), color = c("green", "deepskyblue", "red"))
plotGrandLinear(gr.snp, aes(y = pvalue), color = "red")

##  cutoff
plotGrandLinear(gr.snp, aes(y = pvalue), cutoff = 3, cutoff.color = "blue", cutoff.size = 4)

##  cutoff-low
plotGrandLinear(gr.snp, aes(y = pvalue)) + geom_hline(yintercept = 3, color = "blue", size = 4)

##  longer
## let's make a long name
nms <- seqnames(seqinfo(gr.snp))
nms.new <- paste("chr00000", nms, sep = "")
names(nms.new) <- nms
gr.snp <- renameSeqlevels(gr.snp, nms.new)
seqlengths(gr.snp)

##  rotate
plotGrandLinear(gr.snp, aes(y = pvalue)) + theme(axis.text.x=element_text(angle=-90, hjust=0))

##  sessionInfo
sessionInfo()

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(ggbio)
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: ggplot2
Need specific help about ggbio? try mailing 
 the maintainer or visit http://tengfei.github.com/ggbio/

Attaching package: 'ggbio'

The following objects are masked from 'package:ggplot2':

    geom_bar, geom_rect, geom_segment, ggsave, stat_bin, stat_identity,
    xlim

Warning message:
replacing previous import 'ggplot2::Position' by 'BiocGenerics::Position' when loading 'ggbio' 
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/ggbio/plotGrandLinear.Rd_%03d_medium.png", width=480, height=480)
> ### Name: plotGrandLinear
> ### Title: Manhattan for GWAS
> ### Aliases: plotGrandLinear
> 
> ### ** Examples
> 
> ##  load
> library(ggbio)
> data(hg19IdeogramCyto, package = "biovizBase")
> data(hg19Ideogram, package = "biovizBase")
> library(GenomicRanges)
Loading required package: S4Vectors
Loading required package: stats4

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: IRanges
Loading required package: GenomeInfoDb
> 
> ##  simul_gr
> library(biovizBase)
> gr <- GRanges(rep(c("chr1", "chr2"), each = 5),
+               IRanges(start = rep(seq(1, 100, length = 5), times = 2),
+                       width = 50))
> autoplot(gr)
> 
> ##  coord:genome
> autoplot(gr, coord = "genome")
using coord:genome to parse x scale
> gr.t <- transformToGenome(gr)
> head(gr.t)
GRanges object with 6 ranges and 2 metadata columns:
      seqnames     ranges strand |    .start      .end
         <Rle>  <IRanges>  <Rle> | <numeric> <numeric>
  [1]     chr1 [  1,  50]      * |         1        50
  [2]     chr1 [ 25,  74]      * |        25        74
  [3]     chr1 [ 50,  99]      * |        50        99
  [4]     chr1 [ 75, 124]      * |        75       124
  [5]     chr1 [100, 149]      * |       100       149
  [6]     chr2 [  1,  50]      * |       180       229
  -------
  seqinfo: 2 sequences from an unspecified genome; no seqlengths
> 
> ##  is
> is_coord_genome(gr.t)
[1] TRUE
> metadata(gr.t)$coord
[1] "genome"
> 
> 
> ##  simul_snp
> chrs <- as.character(levels(seqnames(hg19IdeogramCyto)))
> seqlths <- seqlengths(hg19Ideogram)[chrs]
> set.seed(1)
> nchr <- length(chrs)
> nsnps <- 100
> gr.snp <- GRanges(rep(chrs,each=nsnps),
+                   IRanges(start =
+                           do.call(c, lapply(chrs, function(chr){
+                             N <- seqlths[chr]
+                             runif(nsnps,1,N)
+                           })), width = 1),
+                   SNP=sapply(1:(nchr*nsnps), function(x) paste("rs",x,sep='')),
+                   pvalue =  -log10(runif(nchr*nsnps)),
+                   group = sample(c("Normal", "Tumor"), size = nchr*nsnps,
+                     replace = TRUE)
+                   )
> 
> ##  shorter
> seqlengths(gr.snp)
 chr1 chr10 chr11 chr12 chr13 chr14 chr15 chr16 chr17 chr18 chr19  chr2 chr20 
   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA 
chr21 chr22  chr3  chr4  chr5  chr6  chr7  chr8  chr9  chrX  chrY 
   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA 
> nms <- seqnames(seqinfo(gr.snp))
> nms.new <- gsub("chr", "", nms)
> names(nms.new) <- nms
> gr.snp <- renameSeqlevels(gr.snp, nms.new)
> seqlengths(gr.snp)
 1 10 11 12 13 14 15 16 17 18 19  2 20 21 22  3  4  5  6  7  8  9  X  Y 
NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 
> 
> 
> 
> ##  unorder
> autoplot(gr.snp, coord = "genome", geom = "point", aes(y = pvalue), space.skip = 0.01)
using coord:genome to parse x scale
> 
> ##  sort
> gr.snp <- keepSeqlevels(gr.snp, c(1:22, "X", "Y"))
> autoplot(gr.snp, coord = "genome", geom = "point", aes(y = pvalue), space.skip = 0.01)
using coord:genome to parse x scale
> 
> ##  with_seql
> names(seqlths) <- gsub("chr", "", names(seqlths))
> seqlengths(gr.snp) <- seqlths[names(seqlengths(gr.snp))]
> autoplot(gr.snp, coord = "genome", geom = "point", aes(y = pvalue), space.skip = 0.01)
using coord:genome to parse x scale
> 
> ##  line
> autoplot(gr.snp, coord = "genome", geom = "line", aes(y = pvalue, group = seqnames,
+                                      color = seqnames))
using coord:genome to parse x scale
> 
> 
> 
> ##  plotGrandLinear
> plotGrandLinear(gr.snp, aes(y = pvalue))
using coord:genome to parse x scale
> 
> ##  morecolor
> plotGrandLinear(gr.snp, aes(y = pvalue, color = seqnames))
using coord:genome to parse x scale
> plotGrandLinear(gr.snp, aes(y = pvalue), color = c("green", "deepskyblue"))
using coord:genome to parse x scale
> plotGrandLinear(gr.snp, aes(y = pvalue), color = c("green", "deepskyblue", "red"))
using coord:genome to parse x scale
> plotGrandLinear(gr.snp, aes(y = pvalue), color = "red")
using coord:genome to parse x scale
> 
> ##  cutoff
> plotGrandLinear(gr.snp, aes(y = pvalue), cutoff = 3, cutoff.color = "blue", cutoff.size = 4)
using coord:genome to parse x scale
> 
> ##  cutoff-low
> plotGrandLinear(gr.snp, aes(y = pvalue)) + geom_hline(yintercept = 3, color = "blue", size = 4)
using coord:genome to parse x scale
> 
> ##  longer
> ## let's make a long name
> nms <- seqnames(seqinfo(gr.snp))
> nms.new <- paste("chr00000", nms, sep = "")
> names(nms.new) <- nms
> gr.snp <- renameSeqlevels(gr.snp, nms.new)
> seqlengths(gr.snp)
 chr000001  chr000002  chr000003  chr000004  chr000005  chr000006  chr000007 
 249250621  243199373  198022430  191154276  180915260  171115067  159138663 
 chr000008  chr000009 chr0000010 chr0000011 chr0000012 chr0000013 chr0000014 
 146364022  141213431  135534747  135006516  133851895  115169878  107349540 
chr0000015 chr0000016 chr0000017 chr0000018 chr0000019 chr0000020 chr0000021 
 102531392   90354753   81195210   78077248   59128983   63025520   48129895 
chr0000022  chr00000X  chr00000Y 
  51304566  155270560   59373566 
> 
> ##  rotate
> plotGrandLinear(gr.snp, aes(y = pvalue)) + theme(axis.text.x=element_text(angle=-90, hjust=0))
using coord:genome to parse x scale
> 
> ##  sessionInfo
> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04 LTS

locale:
[1] C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
[1] biovizBase_1.20.0    GenomicRanges_1.24.2 GenomeInfoDb_1.8.1  
[4] IRanges_2.6.1        S4Vectors_0.10.1     ggbio_1.20.1        
[7] ggplot2_2.1.0        BiocGenerics_0.18.0 

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.5                   lattice_0.20-33              
 [3] Rsamtools_1.24.0              Biostrings_2.40.2            
 [5] digest_0.6.9                  mime_0.4                     
 [7] R6_2.1.2                      plyr_1.8.4                   
 [9] chron_2.3-47                  acepack_1.3-3.3              
[11] RSQLite_1.0.0                 httr_1.2.0                   
[13] BiocInstaller_1.22.3          zlibbioc_1.18.0              
[15] GenomicFeatures_1.24.3        data.table_1.9.6             
[17] rpart_4.1-10                  Matrix_1.2-6                 
[19] labeling_0.3                  splines_3.3.1                
[21] BiocParallel_1.6.2            AnnotationHub_2.4.2          
[23] stringr_1.0.0                 foreign_0.8-66               
[25] RCurl_1.95-4.8                biomaRt_2.28.0               
[27] munsell_0.4.3                 shiny_0.13.2                 
[29] httpuv_1.3.3                  rtracklayer_1.32.1           
[31] htmltools_0.3.5               nnet_7.3-12                  
[33] SummarizedExperiment_1.2.3    gridExtra_2.2.1              
[35] interactiveDisplayBase_1.10.3 Hmisc_3.17-4                 
[37] XML_3.98-1.4                  reshape_0.8.5                
[39] GenomicAlignments_1.8.3       bitops_1.0-6                 
[41] RBGL_1.48.1                   grid_3.3.1                   
[43] xtable_1.8-2                  GGally_1.1.0                 
[45] gtable_0.2.0                  DBI_0.4-1                    
[47] magrittr_1.5                  scales_0.4.0                 
[49] graph_1.50.0                  stringi_1.1.1                
[51] XVector_0.12.0                reshape2_1.4.1               
[53] latticeExtra_0.6-28           Formula_1.2-1                
[55] RColorBrewer_1.1-2            ensembldb_1.4.6              
[57] tools_3.3.1                   dichromat_2.0-0              
[59] OrganismDbi_1.14.1            BSgenome_1.40.1              
[61] Biobase_2.32.0                survival_2.39-5              
[63] AnnotationDbi_1.34.3          colorspace_1.2-6             
[65] cluster_2.0.4                 VariantAnnotation_1.18.1     
> 
> 
> 
> 
> 
> dev.off()
null device 
          1 
>