Last data update: 2014.03.03

ensemblVEP    R Documentation

Query Ensembl Variant Effect Predictor

Description

Retrieve variant annotation data from the Ensembl Variant Effect Predictor (VEP).

Usage

## S4 method for signature 'character'
ensemblVEP(file, param=VEPParam(), ...)

Arguments

file

A character specifying the full path to the file, including the file name.

Valid input file types are described on the Ensembl VEP web page: http://www.ensembl.org/info/docs/variation/vep/vep_script.html#running

param

An instance of VEPParam specifying runtime options.

...

Additional arguments passed to methods.

Details

The Ensembl VEP tool is described in detail on its home page (see the link in the 'References' section). The ensemblVEP function wraps the Perl API and requires a local installation of the Ensembl VEP that is available in the user's PATH. The VEPParam class provides a way to specify runtime options. Results are returned from Ensembl VEP as GRanges (default) or VCF objects. Alternatively, results can be written directly to a file.
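
Because the function depends on a locally installed VEP script, it can help to check for it before calling ensemblVEP(). The following is a minimal sketch using only base R. The helper name vep_available is hypothetical and not part of the package, and the default script name is taken from the error message shown in the Results section of this page; other VEP releases may ship the script under a different name.

```r
## Hypothetical helper (not part of ensemblVEP): report whether a VEP
## script with the given name can be located on the PATH.
## The default script name is an assumption based on the error message
## in the session log on this page.
vep_available <- function(script = "variant_effect_predictor.pl") {
  path <- Sys.which(script)   # returns "" when the command is not found
  nzchar(unname(path))
}

if (vep_available()) {
  message("Ensembl VEP script found on PATH")
} else {
  message("Ensembl VEP script not found; install it and add it to PATH")
}
```

Running this check first gives a clear message up front rather than an error from inside ensemblVEP().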

Value

Default behavior returns a GRanges object. Options can be set to return a VCF object or write a file to disk.

Author(s)

Valerie Obenchain

References

Ensembl VEP Home: http://www.ensembl.org/info/docs/tools/vep/index.html

Human Genome Variation Society (hgvs): http://www.hgvs.org/mutnomen/

See Also

VEPParam-class

Examples

  ## -----------------------------------------------------------------------
  ## Results returned as GRanges or VCF objects
  ## -----------------------------------------------------------------------
  ## The default behavior returns a GRanges with the consequence
  ## data as metadata columns.
  file <- system.file("extdata", "ex2.vcf", package="VariantAnnotation") 
  ## Not run: 
  gr <- ensemblVEP(file)
  gr[1:3]
  
## End(Not run)
  ## When the 'vcf' option is TRUE, a VCF object is returned.
  myparam <- VEPParam(dataformat=c(vcf=TRUE))
  vcf <- ensemblVEP(file, param=myparam)
  vcf
 
  ## The consequence data are returned as the 'CSQ' column in info.
  info(vcf)$CSQ
 
  ## To parse this column use parseCSQToGRanges().
  csq <- parseCSQToGRanges(vcf)
  head(csq, 4)
 
  ## The columns returned are controlled by the 'fields' option. 
  ## By default all fields are returned. See ?VEPParam for details.
 
  ## When comparing ensemblVEP() results to the data in the
  ## input vcf we see variant 20:1230237 was not returned.
  vcf_input <- readVcf(file, "hg19")
  rowRanges(vcf_input)
  rowRanges(vcf)
 
  ## This variant has no alternate allele and is called a
  ## monomorphic reference. The Ensembl VEP automatically
  ## drops these variants. 
  rowRanges(vcf_input)[,c("REF", "ALT")]
 
  ## -----------------------------------------------------------------------
  ## Results written to disk
  ## -----------------------------------------------------------------------
  ## Write a file to disk by providing a path and file name as 'output_file'.
  ## Different output file formats are specified using the 'dataformat' 
  ## runtime options.
 
  ## Write a vcf file to myfile.vcf:
  myparam <- VEPParam(dataformat=c(vcf=TRUE), 
                      input=c(output_file="/path/myfile.vcf"))
  ## Write a gvf file to myfile.gvf:
  myparam <- VEPParam(dataformat=c(gvf=TRUE), 
                      input=c(output_file="/path/myfile.gvf"))
 
  ## -----------------------------------------------------------------------
  ## Runtime options
  ## -----------------------------------------------------------------------
  ## All runtime options are controlled by specifying a VEPParam.
  ## See ?VEPParam for complete details.
  param <- VEPParam()
 
  ## Logical options are turned on/off with TRUE/FALSE. By
  ## default, 'quiet' is FALSE.
  basic(param)$quiet
 
  ## Setting 'quiet' to TRUE will suppress all status and warning messages.
  basic(param)$quiet <- TRUE
 
  ## Character options are turned on/off by specifying a character 
  ## value or an empty character (i.e., character()). By default no 
  ## 'sift' results are returned.
  output(param)$sift
 
  ## Setting 'sift' to 'b' will return both predictions and scores.
  output(param)$sift <- 'b'
 
  ## Return 'sift' to the original state of no results returned.
  output(param)$sift <- character() 
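
The examples above note that Ensembl VEP drops monomorphic reference records, that is, records with no alternate allele. As a rough illustration of how such records can be spotted before running VEP, the sketch below uses base R on a few toy tab-separated VCF body lines. The lines are illustrative values typed in by hand, not read from ex2.vcf, and the sketch assumes the VCF convention of writing a missing ALT as ".".

```r
## Toy VCF body lines (illustrative values only).
## Columns: CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO
vcf_lines <- c(
  "20\t14370\trs6054257\tG\tA\t29\tPASS\t.",
  "20\t1230237\t.\tT\t.\t47\tPASS\t.",
  "20\t1110696\trs6040355\tA\tG,T\t67\tPASS\t."
)

## Split each record into its tab-separated fields.
fields <- strsplit(vcf_lines, "\t", fixed = TRUE)
pos <- vapply(fields, `[`, character(1), 2)
alt <- vapply(fields, `[`, character(1), 5)

## A record with ALT equal to "." has no alternate allele.
is_monomorphic <- alt == "."
pos[is_monomorphic]
```

Records flagged this way are the ones expected to be absent from the ensemblVEP() result.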

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Platform: x86_64-pc-linux-gnu (64-bit)

> library(ensemblVEP)
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: GenomicRanges
Loading required package: S4Vectors
Loading required package: stats4

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: VariantAnnotation
Loading required package: SummarizedExperiment
Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: Rsamtools
Loading required package: Biostrings
Loading required package: XVector

Attaching package: 'VariantAnnotation'

The following object is masked from 'package:base':

    tabulate

variant_effect_predictor.pl not found. Ensembl VEP is not installed in your path.

Attaching package: 'ensemblVEP'

The following object is masked from 'package:Biobase':

    cache

> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/ensemblVEP/ensemblVEP.Rd_%03d_medium.png", width=480, height=480)
> ### Name: ensemblVEP
> ### Title: Query Ensembl Variant Effect Predictor
> ### Aliases: ensemblVEP ensemblVEP,character-method
> ### Keywords: methods
> 
> ### ** Examples
> 
>   ## -----------------------------------------------------------------------
>   ## Results returned as GRanges or VCF objects
>   ## -----------------------------------------------------------------------
>   ## The default behavior returns a GRanges with the consequence
>   ## data as metadata columns.
>   file <- system.file("extdata", "ex2.vcf", package="VariantAnnotation") 
>   ## Not run: 
> ##D   gr <- ensemblVEP(file)
> ##D   gr[1:3]
> ##D   
> ## End(Not run)
>   ## When the 'vcf' option is TRUE, a VCF object is returned.
>   myparam <- VEPParam(dataformat=c(vcf=TRUE))
>   vcf <- ensemblVEP(file, param=myparam)
Error in .getVepPath() : 
  Couldn't find variant_effect_predictor.pl in your PATH.
Calls: ensemblVEP -> ensemblVEP -> .getVepPath
Execution halted
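
The session above halts because the VEP script is not on the PATH. One way to keep a script running in that situation is to wrap the call in tryCatch(). The sketch below is an assumption about how one might do this; run_vep_safely is a hypothetical wrapper, not a function provided by ensemblVEP.

```r
## Hypothetical wrapper (not part of the ensemblVEP package): call
## ensemblVEP() and return NULL with a message instead of stopping
## when the package or the VEP script is unavailable.
run_vep_safely <- function(file, ...) {
  if (!requireNamespace("ensemblVEP", quietly = TRUE)) {
    message("Package 'ensemblVEP' is not installed")
    return(NULL)
  }
  tryCatch(
    ensemblVEP::ensemblVEP(file, ...),
    error = function(e) {
      message("ensemblVEP() failed: ", conditionMessage(e))
      NULL
    }
  )
}

## system.file() returns "" if VariantAnnotation is not installed.
vcf_path <- system.file("extdata", "ex2.vcf", package = "VariantAnnotation")
res <- run_vep_safely(vcf_path)
if (is.null(res)) message("No VEP results available")
```

Returning NULL lets the caller decide whether to skip the annotation step or stop with its own error.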