This GRanges instance includes locations for 297,000 1000 Genomes SNPs that have been identified
as exhibiting linkage disequilibrium (LD) with NHGRI GWAS catalog SNPs as of September 2013.
The tagid field gives the name of the tagging SNP, the baseid field is the SNP identifier
for the GWAS catalog entry, and the R2 field gives the value of R-squared relating the
distributions of the tagging SNP and the GWAS entry. Only tagging SNPs with R-squared of
0.8 or greater are included.
A self-contained R-based procedure should emerge in 2014.
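To make the three metadata fields concrete, here is a minimal base-R sketch. It uses a plain data.frame with invented SNP identifiers and R-squared values (all hypothetical, not taken from gwastagger), and it only illustrates the 0.8 threshold described above. It is not the procedure used to build the data set.

```r
# Toy stand-in for the tagid / R2 / baseid metadata columns.
# All identifiers and values below are invented for illustration.
tag_table <- data.frame(
  tagid  = c("rs10", "rs11", "rs1", "rs12"),  # tagging SNP
  R2     = c(0.95, 0.85, 1.00, 0.62),         # R-squared with the catalog SNP
  baseid = c("rs1", "rs1", "rs1", "rs2"),     # GWAS catalog SNP being tagged
  stringsAsFactors = FALSE
)

# Keep only tags that meet the R-squared >= 0.8 rule.
kept <- tag_table[tag_table$R2 >= 0.8, ]

# A catalog SNP can appear as its own tag (tagid equal to baseid, R2 = 1).
self_tags <- kept[kept$tagid == kept$baseid, ]

print(kept)
print(self_tags)
```

In this toy table, rs12 is dropped because its R-squared falls below 0.8, so rs2 ends up with no tagging SNP at all.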
Source
NHGRI GWAS catalog. PLINK is used with the 1000 Genomes VCF in a Perl routine
by Michael McGeachie, Harvard Medical School.
Examples
data(gwastagger)
gwastagger[1:5]
data(ebicat37)
mean(ebicat37$SNPS %in% gwastagger$baseid)
# ideally, all GWAS SNP would be in our tagging ranges as baseid
query <- setdiff(ebicat37$SNPS, gwastagger$baseid)
# relatively recent catalog additions
sort(table(ebicat37[which(ebicat37$SNPS %in% query)]$DATE.ADDED.TO.CATALOG), decreasing=TRUE)[1:10]
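The logic of the example above can be sketched on toy data with base R only. The data.frame `catalog`, its column `DATE_ADDED`, and the vector `tagged_baseid` are hypothetical stand-ins for the catalog and the baseid field, not objects from the package. One detail worth noting: indexing a short table with `[1:10]` returns `NA` for the missing positions, so this sketch caps the index at the table length.

```r
# Toy catalog: SNP identifiers and the date each entry was added (invented).
catalog <- data.frame(
  SNPS       = c("rs1", "rs2", "rs3", "rs4", "rs5"),
  DATE_ADDED = c("2012-05-01", "2013-11-02", "2013-11-02",
                 "2014-01-15", "2011-03-09"),
  stringsAsFactors = FALSE
)

# Catalog SNPs that have at least one tagging SNP (hypothetical).
tagged_baseid <- c("rs1", "rs5")

# Fraction of catalog SNPs covered by the tagging data.
coverage <- mean(catalog$SNPS %in% tagged_baseid)

# Catalog SNPs with no tagging SNP.
query <- setdiff(catalog$SNPS, tagged_baseid)

# Count untagged entries per date, most frequent first.
date_counts <- sort(table(catalog$DATE_ADDED[catalog$SNPS %in% query]),
                    decreasing = TRUE)

# Take at most 10 entries without padding the result with NA.
top <- date_counts[seq_len(min(10, length(date_counts)))]

print(coverage)
print(query)
print(top)
```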
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
> library(gwascat)
Loading required package: Homo.sapiens
Loading required package: AnnotationDbi
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from 'package:stats':
IQR, mad, xtabs
The following objects are masked from 'package:base':
Filter, Find, Map, Position, Reduce, anyDuplicated, append,
as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
rbind, rownames, sapply, setdiff, sort, table, tapply, union,
unique, unsplit
Loading required package: Biobase
Welcome to Bioconductor
Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.
Loading required package: IRanges
Loading required package: S4Vectors
Attaching package: 'S4Vectors'
The following objects are masked from 'package:base':
colMeans, colSums, expand.grid, rowMeans, rowSums
Loading required package: OrganismDbi
Loading required package: GenomicFeatures
Loading required package: GenomeInfoDb
Loading required package: GenomicRanges
Loading required package: GO.db
Loading required package: org.Hs.eg.db
Loading required package: TxDb.Hsapiens.UCSC.hg19.knownGene
gwascat loaded. Use data(ebicat38) for hg38 coordinates;
data(ebicat37) for hg19 coordinates.
Warning message:
replacing previous import 'ggplot2::Position' by 'BiocGenerics::Position' when loading 'ggbio'
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/gwascat/gwastagger.Rd_%03d_medium.png", width=480, height=480)
> ### Name: gwastagger
> ### Title: data on 1000 genomes SNPs that 'tag' GWAS catalog entries
> ### Aliases: gwastagger
> ### Keywords: datasets
>
> ### ** Examples
>
> data(gwastagger)
> gwastagger[1:5]
GRanges object with 5 ranges and 3 metadata columns:
seqnames ranges strand | tagid R2 baseid
<Rle> <IRanges> <Rle> | <character> <numeric> <character>
[1] chr1 [986111, 986111] * | rs28479311 0.938021 rs3934834
[2] chr1 [988364, 988364] * | rs3813193 0.993718 rs3934834
[3] chr1 [992250, 992250] * | chr1:992250 0.969160 rs3934834
[4] chr1 [992402, 992402] * | rs60442576 1.000000 rs3934834
[5] chr1 [995669, 995669] * | rs3934834 1.000000 rs3934834
-------
seqinfo: 24 sequences from 2 genomes (hg19, NA)
> data(ebicat37)
> mean(ebicat37$SNPS %in% gwastagger$baseid)
[1] 0.6136283
> # ideally, all GWAS SNP would be in our tagging ranges as baseid
> query <- setdiff(ebicat37$SNPS, gwastagger$baseid)
> # relatively recent catalog additions
> sort(table(ebicat37[which(ebicat37$SNPS %in% query)]$DATE.ADDED.TO.CATALOG), decreasing=TRUE)[1:10]
[1] NA NA NA NA NA NA NA NA NA NA
> dev.off()
null device
1
>