Last data update: 2014.03.03

R: Filter Association Test Results According to p-Values or...
filterResult-methodsR Documentation

Filter Association Test Results According to p-Values or Variants' Contributions

Description

Given an AssocTestResultRanges object, this method filters regions according to p-values or variants' contributions.

Usage

## S4 method for signature 'AssocTestResultRanges'
filterResult(object, cutoff=0.05,
             filterBy=c("p.value", "p.value.adj", "p.value.resampled",
             "p.value.resampled.adj"))
## S4 method for signature 'GRanges'
filterResult(object, cutoff=0.1)
## S4 method for signature 'GRangesList'
filterResult(object, cutoff=0.1)

Arguments

object

object of class AssocTestResultRanges, GRanges, or GRangesList

cutoff

threshold

filterBy

according to which p-value column filtering should be done; the default is “p.value”.

Details

If called for an AssocTestResultRanges object as first argument, this method strips off all regions the p-values of which exceed the threshold cutoff. By default, this filtering is applied to raw p-values (metadata column “p.value”). The filterBy argument allows for performing filtering on any of the other three p-value metadata columns (if available, otherwise the method quits with an error).

If called for a GRanges object as first argument, this method checks if the first argument object has a metadata column named “weight.contribution”. If it exists, it returns a GRanges object with the elements of object that have a value greater than cutoff in the “weight.contribution” metadata column. If this metadata column does not exist, the method quits with an error.

If called for a GRangesList object as first argument object, this method applies the filterResult method for each of its list components and returns a GRangesList object. If any of the components of object does not have a metadata column named “weight.contribution”, the method quits with an error.

Value

an object of class AssocTestResultRanges, GRanges, or GRangesList (see details above)

Author(s)

Ulrich Bodenhofer bodenhofer@bioinf.jku.at

References

http://www.bioinf.jku.at/software/podkat

See Also

AssocTestResultRanges, p.adjust

Examples

## load genome description
data(hgA)

## partition genome into overlapping windows
windows <- partitionRegions(hgA)

## load genotype data from VCF file
vcfFile <- system.file("examples/example1.vcf.gz", package="podkat")
Z <- readGenotypeMatrix(vcfFile)

## read phenotype data from CSV file (continuous trait + covariates)
phenoFile <- system.file("examples/example1lin.csv", package="podkat")
pheno <-read.table(phenoFile, header=TRUE, sep=",")

## train null model with all covariates in data frame 'pheno'
nm.lin <- nullModel(y ~ ., pheno)

## perform association test for multiple regions
res <- assocTest(Z, nm.lin, windows)
res.adj <- p.adjust(res, method="BH")

## show filtered results
res.f <- filterResult(res.adj)
print(res.f)
res.f <- filterResult(res.adj, filterBy="p.value.adj")
print(res.f)

## compute contributions
contrib <- weights(res.f, Z, nm.lin)
contrib

## extract most indicative variants
filterResult(contrib[[1]])
filterResult(contrib)

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(podkat)
Loading required package: Rsamtools
Loading required package: GenomeInfoDb
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: S4Vectors

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: IRanges
Loading required package: GenomicRanges
Loading required package: Biostrings
Loading required package: XVector
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/podkat/filterResult-methods.Rd_%03d_medium.png", width=480, height=480)
> ### Name: filterResult-methods
> ### Title: Filter Association Test Results According to p-Values or
> ###   Variants' Contributions
> ### Aliases: filterResult filterResult-methods
> ###   filterResult,AssocTestResultRanges-method filterResult,GRanges-method
> ###   filterResult,GRangesList-method
> ### Keywords: methods
> 
> ### ** Examples
> 
> ## load genome description
> data(hgA)
> 
> ## partition genome into overlapping windows
> windows <- partitionRegions(hgA)
> 
> ## load genotype data from VCF file
> vcfFile <- system.file("examples/example1.vcf.gz", package="podkat")
> Z <- readGenotypeMatrix(vcfFile)
> 
> ## read phenotype data from CSV file (continuous trait + covariates)
> phenoFile <- system.file("examples/example1lin.csv", package="podkat")
> pheno <-read.table(phenoFile, header=TRUE, sep=",")
> 
> ## train null model with all covariates in data frame 'pheno'
> nm.lin <- nullModel(y ~ ., pheno)
> 
> ## perform association test for multiple regions
> res <- assocTest(Z, nm.lin, windows)
> res.adj <- p.adjust(res, method="BH")
> 
> ## show filtered results
> res.f <- filterResult(res.adj)
> print(res.f)
Overview of association test:
	Null model: linear 
	Number of samples: 200 
	Number of regions: 8 
	Number of regions without variants: 0 
	Average number of variants in regions: 24.9 
	Genome: hgA 
	Kernel: linear.podkat 
	p-value adjustment: BH 

Overview of significance of results:
	Number of tests with p < 0.05: 8
	Number of tests with adj. p < 0.05: 2

Results for the 8 most significant regions:
  seqnames  start    end width  n         Q      p.value  p.value.adj
1     chr1   7501  12500  5000 31 769748.34 1.294084e-07 1.022327e-05
2     chr1  10001  15000  5000 33 764828.81 4.874460e-06 1.925412e-04
3     chr1 140001 145000  5000 15  79937.68 3.599077e-03 9.477570e-02
4     chr1   5001  10000  5000 34 152555.30 9.785569e-03 1.932650e-01
5     chr1 132501 137500  5000 21  89287.55 1.349559e-02 2.132303e-01
6     chr1 142501 147500  5000 23  94629.68 3.338620e-02 3.707625e-01
7     chr1  42501  47500  5000 19  58191.23 3.341032e-02 3.707625e-01
8     chr1  25001  30000  5000 23 103713.12 3.754557e-02 3.707625e-01
> res.f <- filterResult(res.adj, filterBy="p.value.adj")
> print(res.f)
Overview of association test:
	Null model: linear 
	Number of samples: 200 
	Number of regions: 2 
	Number of regions without variants: 0 
	Average number of variants in regions: 32.0 
	Genome: hgA 
	Kernel: linear.podkat 
	p-value adjustment: BH 

Overview of significance of results:
	Number of tests with p < 0.05: 2
	Number of tests with adj. p < 0.05: 2

Results for the 2 most significant regions:
  seqnames start   end width  n        Q      p.value  p.value.adj
1     chr1  7501 12500  5000 31 769748.3 1.294084e-07 1.022327e-05
2     chr1 10001 15000  5000 33 764828.8 4.874460e-06 1.925412e-04
> 
> ## compute contributions
> contrib <- weights(res.f, Z, nm.lin)
> contrib
GRangesList object of length 2:
$chr1:7501-12500 
GRanges object with 31 ranges and 2 metadata columns:
          seqnames         ranges strand |        weight.raw
             <Rle>      <IRanges>  <Rle> |         <numeric>
  snv:160     chr1   [7713, 7713]      * | -44.4915214738852
  snv:164     chr1   [7834, 7834]      * | -40.8922243097158
  snv:166     chr1   [7932, 7932]      * | -43.5583181677057
  snv:167     chr1   [7976, 7976]      * | -44.3762021678535
  snv:177     chr1   [8342, 8342]      * | -56.9143639802579
      ...      ...            ...    ... .               ...
  snv:249     chr1 [11888, 11888]      * |   324.93127428678
  snv:256     chr1 [12191, 12191]      * |  189.297136566301
  snv:258     chr1 [12273, 12273]      * |  161.128284537927
  snv:261     chr1 [12307, 12307]      * |  144.858945312137
  snv:264     chr1 [12369, 12369]      * |  115.201751621058
           weight.contribution
                     <numeric>
  snv:160 0.000833988385587964
  snv:164 0.000704509666285363
  snv:166 0.000799369719837965
  snv:167 0.000829670694537205
  snv:177  0.00136473792799336
      ...                  ...
  snv:249    0.044482431119455
  snv:256    0.015097101560757
  snv:258   0.0109382804504061
  snv:261  0.00884089281254714
  snv:264   0.0055914397873343

...
<1 more element>
-------
seqinfo: 1 sequence from hgA genome
> 
> ## extract most indicative variants
> filterResult(contrib[[1]])
GRanges object with 3 ranges and 2 metadata columns:
          seqnames         ranges strand |       weight.raw weight.contribution
             <Rle>      <IRanges>  <Rle> |        <numeric>           <numeric>
  snv:237     chr1 [11387, 11387]      * | 494.787767873738   0.103143863855037
  snv:239     chr1 [11392, 11392]      * | 495.983773653802   0.103643107206047
  snv:240     chr1 [11402, 11402]      * |    494.259038387     0.1029235428705
  -------
  seqinfo: 1 sequence from hgA genome
> filterResult(contrib)
GRangesList object of length 2:
$chr1:7501-12500 
GRanges object with 3 ranges and 2 metadata columns:
          seqnames         ranges strand |       weight.raw weight.contribution
             <Rle>      <IRanges>  <Rle> |        <numeric>           <numeric>
  snv:237     chr1 [11387, 11387]      * | 494.787767873738   0.103143863855037
  snv:239     chr1 [11392, 11392]      * | 495.983773653802   0.103643107206047
  snv:240     chr1 [11402, 11402]      * |    494.259038387     0.1029235428705

$chr1:10001-15000 
GRanges object with 3 ranges and 2 metadata columns:
          seqnames         ranges strand |       weight.raw weight.contribution
  snv:237     chr1 [11387, 11387]      * | 494.787767873738   0.103807305956991
  snv:239     chr1 [11392, 11392]      * | 495.983773653802   0.104309760541765
  snv:240     chr1 [11402, 11402]      * |    494.259038387   0.103585567823516

-------
seqinfo: 1 sequence from hgA genome
> 
> 
> 
> 
> 
> dev.off()
null device 
          1 
>