Last data update: 2014.03.03

R: Annotation of genes affected by structural variations
geneAnnotation                R Documentation

Annotation of genes affected by structural variations

Description

Report the details of genes affected by structural variations.

Usage

    geneAnnotation(structuralVariation,genomeAnnotation)

Arguments

structuralVariation

A data frame of structural variations.

genomeAnnotation

A GRanges object (package GenomicRanges) holding the genome annotation.

Details

A structural variation (deletion, duplication, inversion, etc.) can alter the structure of a gene, for example by deleting introns or exons or by deleting the whole gene. A single gene may also be affected by multiple SVs. This function reports, from the point of view of each gene, the detailed effects of structural variations on the gene and its elements.
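
As a rough illustration of what "affecting" a gene element can mean, the sketch below computes the fraction of one element (for example an exon) that is covered by a structural variation, using closed integer intervals. The helper name overlapFraction and all coordinates are hypothetical, and this is not the code used inside geneAnnotation.

```r
# Fraction of the element [elemStart, elemEnd] covered by the SV [svStart, svEnd].
# Coordinates are 1-based and inclusive; 0 means no overlap, 1 means fully covered.
# overlapFraction is a hypothetical helper, not part of intansv.
overlapFraction <- function(svStart, svEnd, elemStart, elemEnd) {
    overlap <- min(svEnd, elemEnd) - max(svStart, elemStart) + 1
    max(overlap, 0) / (elemEnd - elemStart + 1)
}

overlapFraction(100, 200, 151, 250)  # SV covers the first half of the element
overlapFraction(100, 200, 300, 400)  # SV and element do not overlap
overlapFraction(100, 500, 200, 300)  # element lies entirely inside the SV
```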

The parameter "structuralVariation" should be a data frame with three columns:

  • chr: the chromosome of the structural variation.

  • start: the start coordinate of the structural variation.

  • end: the end coordinate of the structural variation.
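
A minimal sketch of a data frame with this layout is shown here. The coordinates are made up for illustration, and the object name sv is hypothetical. Note that the data frames returned by readBreakDancer in the Examples section use the column names chromosome, pos1, pos2 and size instead; whether geneAnnotation matches columns by name or by position is not stated on this page, so check that against the package version in use.

```r
# Hypothetical coordinates; the column names follow the description above.
sv <- data.frame(chr   = c("chr05", "chr05", "chr10"),
                 start = c(65586, 86442, 120288),
                 end   = c(65938, 87430, 127670),
                 stringsAsFactors = FALSE)

# Basic sanity checks before passing the data frame on.
stopifnot(all(c("chr", "start", "end") %in% names(sv)),
          is.numeric(sv$start), is.numeric(sv$end),
          all(sv$start <= sv$end))
str(sv)
```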

Value

A data frame with the following columns:

locus

the gene affected by structural variations.

exon

the effect of structural variations on the exons of a specific gene.

intron

the effect of structural variations on the introns of a specific gene.

cds

the effect of structural variations on the coding sequences (CDSs) of a specific gene.

utr

the effect of structural variations on the untranslated regions (UTRs) of a specific gene.
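
In the example output further down, the exon, intron, cds and utr columns hold colon-separated strings such as "0:0:0:0:0:0:0:0.31". The sketch below splits one such string into a numeric vector. Reading each entry as the affected proportion of one element, in order along the gene, is an assumption based on that output and is not stated on this page; the variable names are hypothetical.

```r
# One value of the "exon" column, copied from the example output.
exonEffect <- "0:0:0:0:0:0:0:0.31"

# Split on ":" and convert to numbers (assumed: one entry per exon).
perExon <- as.numeric(strsplit(exonEffect, ":", fixed = TRUE)[[1]])
perExon

# Number of entries with a non-zero value.
sum(perExon > 0)
```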

Author(s)

Wen Yao

Examples

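    library(intansv)  # also attaches plyr, which provides llply()

    # Read the BreakDancer predictions shipped with the package.
    breakdancer <- readBreakDancer(system.file("extdata/ZS97.breakdancer.sv",
                                   package="intansv"))
    str(breakdancer)

    # Load the bundled genome annotation (creates the object msu_gff_v7).
    load(system.file("extdata/genome.anno.RData",package="intansv"))
    str(msu_gff_v7)

    # Annotate each SV type (each list element) separately.
    gene.breakdancer.anno <- llply(breakdancer,geneAnnotation,
                                   genomeAnnotation=msu_gff_v7)
    str(gene.breakdancer.anno)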

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(intansv)
Loading required package: plyr
Loading required package: ggbio
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: ggplot2
Need specific help about ggbio? try mailing 
 the maintainer or visit http://tengfei.github.com/ggbio/

Attaching package: 'ggbio'

The following objects are masked from 'package:ggplot2':

    geom_bar, geom_rect, geom_segment, ggsave, stat_bin, stat_identity,
    xlim

Loading required package: GenomicRanges
Loading required package: S4Vectors
Loading required package: stats4

Attaching package: 'S4Vectors'

The following object is masked from 'package:plyr':

    rename

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: IRanges

Attaching package: 'IRanges'

The following object is masked from 'package:plyr':

    desc

Loading required package: GenomeInfoDb
Warning message:
replacing previous import 'ggplot2::Position' by 'BiocGenerics::Position' when loading 'ggbio' 
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/intansv/geneAnnotation.Rd_%03d_medium.png", width=480, height=480)
> ### Name: geneAnnotation
> ### Title: Annotation of genes affected by structural variations
> ### Aliases: geneAnnotation
> 
> ### ** Examples
> 
>     breakdancer <- readBreakDancer(system.file("extdata/ZS97.breakdancer.sv",
+                                    package="intansv"))
>     str(breakdancer)
List of 2
 $ del:'data.frame':	1007 obs. of  4 variables:
  ..$ chromosome: chr [1:1007] "chr05" "chr05" "chr05" "chr05" ...
  ..$ pos1      : num [1:1007] 65586 86442 120288 153694 201845 ...
  ..$ pos2      : num [1:1007] 65938 87430 127670 153801 201959 ...
  ..$ size      : num [1:1007] 353 988 7382 107 114 ...
 $ inv:'data.frame':	17 obs. of  4 variables:
  ..$ chromosome: chr [1:17] "chr05" "chr05" "chr05" "chr05" ...
  ..$ pos1      : num [1:17] 1291574 6942451 12014581 15092428 18770962 ...
  ..$ pos2      : num [1:17] 1291678 6944637 12015915 15157500 18771414 ...
  ..$ size      : num [1:17] 105 2186 1334 65072 452 ...
 - attr(*, "method")= chr "BreakDancer"
> 
>     load(system.file("extdata/genome.anno.RData",package="intansv"))
>     str(msu_gff_v7)
Formal class 'GRanges' [package "GenomicRanges"] with 6 slots
  ..@ seqnames       :Formal class 'Rle' [package "IRanges"] with 4 slots
  .. .. ..@ values         : Factor w/ 2 levels "chr05","chr10": 1 2
  .. .. ..@ lengths        : int [1:2] 67758 45813
  .. .. ..@ elementMetadata: NULL
  .. .. ..@ metadata       : list()
  ..@ ranges         :Formal class 'IRanges' [package "IRanges"] with 6 slots
  .. .. ..@ start          : int [1:113571] 4003 4003 4003 4003 6935 6935 6935 7271 8050 6935 ...
  .. .. ..@ width          : int [1:113571] 354 354 354 354 2165 2165 234 408 1050 234 ...
  .. .. ..@ NAMES          : NULL
  .. .. ..@ elementType    : chr "integer"
  .. .. ..@ elementMetadata: NULL
  .. .. ..@ metadata       : list()
  ..@ strand         :Formal class 'Rle' [package "IRanges"] with 4 slots
  .. .. ..@ values         : Factor w/ 3 levels "+","-","*": 1 2 1 2 1 2 1 2 1 2 ...
  .. .. ..@ lengths        : int [1:4120] 12 4 4 38 29 76 12 188 76 32 ...
  .. .. ..@ elementMetadata: NULL
  .. .. ..@ metadata       : list()
  ..@ elementMetadata:Formal class 'DataFrame' [package "IRanges"] with 6 slots
  .. .. ..@ rownames       : NULL
  .. .. ..@ nrows          : int 113571
  .. .. ..@ listData       :List of 8
  .. .. .. ..$ source: Factor w/ 1 level "MSU_osa1r7": 1 1 1 1 1 1 1 1 1 1 ...
  .. .. .. ..$ type  : Factor w/ 6 levels "CDS","exon","five_prime_UTR",..: 4 5 2 1 4 5 2 2 2 1 ...
  .. .. .. ..$ score : num [1:113571] NA NA NA NA NA NA NA NA NA NA ...
  .. .. .. ..$ phase : int [1:113571] NA NA NA NA NA NA NA NA NA NA ...
  .. .. .. ..$ ID    : chr [1:113571] "LOC_Os05g00988" "LOC_Os05g00988.1" "LOC_Os05g00988.1:exon_1" "LOC_Os05g00988.1:cds_1" ...
  .. .. .. ..$ Name  : chr [1:113571] "LOC_Os05g00988" "LOC_Os05g00988.1" NA NA ...
  .. .. .. ..$ Note  :Formal class 'CompressedCharacterList' [package "IRanges"] with 5 slots
  .. .. .. .. .. ..@ elementType    : chr "character"
  .. .. .. .. .. ..@ elementMetadata: NULL
  .. .. .. .. .. ..@ metadata       : list()
  .. .. .. .. .. ..@ partitioning   :Formal class 'PartitioningByEnd' [package "IRanges"] with 5 slots
  .. .. .. .. .. .. .. ..@ end            : int [1:113571] 1 1 1 1 2 2 2 2 2 2 ...
  .. .. .. .. .. .. .. ..@ NAMES          : NULL
  .. .. .. .. .. .. .. ..@ elementType    : chr "integer"
  .. .. .. .. .. .. .. ..@ elementMetadata: NULL
  .. .. .. .. .. .. .. ..@ metadata       : list()
  .. .. .. .. .. ..@ unlistData     : chr [1:8082] "hypothetical protein" "retrotransposon protein, putative, unclassified, expressed" "expressed protein" "expressed protein" ...
  .. .. .. ..$ Parent:Formal class 'CompressedCharacterList' [package "IRanges"] with 5 slots
  .. .. .. .. .. ..@ elementType    : chr "character"
  .. .. .. .. .. ..@ elementMetadata: NULL
  .. .. .. .. .. ..@ metadata       : list()
  .. .. .. .. .. ..@ partitioning   :Formal class 'PartitioningByEnd' [package "IRanges"] with 5 slots
  .. .. .. .. .. .. .. ..@ end            : int [1:113571] 0 1 2 3 3 4 5 6 7 8 ...
  .. .. .. .. .. .. .. ..@ NAMES          : NULL
  .. .. .. .. .. .. .. ..@ elementType    : chr "integer"
  .. .. .. .. .. .. .. ..@ elementMetadata: NULL
  .. .. .. .. .. .. .. ..@ metadata       : list()
  .. .. .. .. .. ..@ unlistData     : chr [1:105489] "LOC_Os05g00988" "LOC_Os05g00988.1" "LOC_Os05g00988.1" "LOC_Os05g00990" ...
  .. .. ..@ elementType    : chr "ANY"
  .. .. ..@ elementMetadata: NULL
  .. .. ..@ metadata       : list()
  ..@ seqinfo        :Formal class 'Seqinfo' [package "GenomicRanges"] with 4 slots
  .. .. ..@ seqnames   : chr [1:2] "chr05" "chr10"
  .. .. ..@ seqlengths : int [1:2] NA NA
  .. .. ..@ is_circular: logi [1:2] NA NA
  .. .. ..@ genome     : chr [1:2] NA NA
  ..@ metadata       : list()
>     gene.breakdancer.anno <- llply(breakdancer,geneAnnotation,
+                                    genomeAnnotation=msu_gff_v7)
>     str(gene.breakdancer.anno)
List of 2
 $ del:'data.frame':	715 obs. of  5 variables:
  ..$ locus : Factor w/ 17634 levels "LOC_Os05g00988",..: 27 28 29 102 104 136 138 181 183 185 ...
  ..$ exon  : chr [1:715] "0:0:0:0:0:0:0:0.31" "0:0:0:0:0:0:0:0:0.46" "0:0:0:0:0:0:0:0:0" "0:0:0:0:0:0:0:0:0:0:0:0" ...
  ..$ intron: chr [1:715] "0:0:0:0:0:0:0" "0:0:0:0:0:0:0:0" "0:0:0:0:0:0:0.4:0" "0:0:0:0:0:0:0:0:0:0:0" ...
  ..$ cds   : chr [1:715] "0:0:0:0:0:0:0:0" "0:0:0:0:0:0:0:0" "0:0:0:0:0:0:0:0" "0:0:0:0:0:0:0:0:0:0:0:0" ...
  ..$ utr   : chr [1:715] "0.37:0" "0.46:0:0" "0:0:0" "0.01:0" ...
 $ inv:'data.frame':	78 obs. of  5 variables:
  ..$ locus : Factor w/ 17634 levels "LOC_Os05g00988",..: 519 2320 4817 4818 4820 4822 4824 4826 4830 4832 ...
  ..$ exon  : chr [1:78] "0.19:0:0:0:0:0" "1:1:1:1:0.41" "1:1:1:0.78" "1:1:1:0.83" ...
  ..$ intron: chr [1:78] "0:0:0:0:0" "1:1:1:1" "1:1:1" "1:1:1" ...
  ..$ cds   : chr [1:78] "0:0:0:0:0" "1:1:1:1:0.54" "1" "1" ...
  ..$ utr   : chr [1:78] "0.19:0:0" "0:1" "0.51:1:1:1:1" "0.59:1:1:1:1" ...
> 
> dev.off()
null device 
          1 
>