Last data update: 2014.03.03

R: Output the variant(SNVs) protein sequences into FASTA format
OutputVarproseq_single    R Documentation

Output the variant(SNVs) protein sequences into FASTA format

Description

Writes the non-synonymous SNVs to a FASTA file, one SNV per sequence.

Usage

  OutputVarproseq_single(vartable, proteinseq, outfile,
    ids, lablersid = FALSE, RPKM = NULL, ...)

Arguments

vartable

A data frame which is the output of aaVariation().

proteinseq

A data frame containing protein IDs and protein sequences.

outfile

Output file name.

ids

A data frame containing gene/transcript/protein ID mapping information.

lablersid

Whether to include the dbSNP rsid in the header of each sequence. Default is FALSE. If TRUE, dbSNP information must have been provided to Positionincoding().

RPKM

If supplied, the RPKM value is included in the header of each sequence. Default is NULL.

...

Additional arguments

Details

This function takes the output of aaVariation() as input and introduces the non-synonymous variations into the protein database. If a protein has more than one SNV, each SNV is introduced separately, so the protein yields as many variant sequences as it has SNVs.
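
The one-SNV-per-sequence behaviour can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper apply_single_snvs(), the column names pos/ref/alt, the toy sequence, and the header format are all hypothetical and chosen only for illustration.

```r
# Hypothetical helper (not part of customProDB): apply each amino acid
# substitution to the reference sequence separately, one per output sequence.
apply_single_snvs <- function(refseq, variants) {
  # variants: data frame with columns pos (1-based residue index),
  # ref (expected reference residue) and alt (variant residue)
  vapply(seq_len(nrow(variants)), function(i) {
    pos <- variants$pos[i]
    # Check that the reference residue matches before substituting
    stopifnot(substr(refseq, pos, pos) == variants$ref[i])
    varseq <- refseq
    substr(varseq, pos, pos) <- variants$alt[i]
    varseq
  }, character(1))
}

# Toy protein with two non-synonymous SNVs (made-up data)
refseq <- "MKTAYIAK"
variants <- data.frame(pos = c(2, 5), ref = c("K", "Y"), alt = c("R", "H"),
                       stringsAsFactors = FALSE)
varseqs <- apply_single_snvs(refseq, variants)

# Two SNVs give two sequences; write them as FASTA records.
# The header layout here is an assumption, not the package's format.
headers <- paste0(">protein1_", variants$ref, variants$pos, variants$alt)
fasta <- as.vector(rbind(headers, varseqs))
outfile <- tempfile(fileext = ".fasta")
writeLines(fasta, outfile)
```

Each variant sequence differs from the reference at exactly one residue, which is why a protein with n SNVs contributes n records to the output file.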

Value

A FASTA file containing protein sequences, each carrying a single nucleotide variation.

Author(s)

Xiaojing Wang

Examples

# Read the example VCF and keep only the SNVs (drop indels).
vcffile <- system.file("extdata/vcfs", "test1.vcf", package="customProDB")
vcf <- InputVcf(vcffile)
table(values(vcf[[1]])[['INDEL']])
index <- which(values(vcf[[1]])[['INDEL']] == FALSE)
SNVvcf <- vcf[[1]][index]

# Load the annotation objects bundled with the package.
load(system.file("extdata/refseq", "exon_anno.RData",
    package="customProDB"))
load(system.file("extdata/refseq", "dbsnpinCoding.RData",
    package="customProDB"))
load(system.file("extdata/refseq", "procodingseq.RData",
    package="customProDB"))
load(system.file("extdata/refseq", "ids.RData", package="customProDB"))
load(system.file("extdata/refseq", "proseq.RData", package="customProDB"))

# Locate the SNVs in coding regions and predict the amino acid changes.
postable_snv <- Positionincoding(SNVvcf, exon, dbsnpinCoding)
txlist <- unique(postable_snv[, 'txid'])
codingseq <- procodingseq[procodingseq[, 'tx_id'] %in% txlist, ]
mtab <- aaVariation(postable_snv, codingseq)

# Write one variant protein sequence per SNV, with rsids in the headers.
outfile <- file.path(tempdir(), 'test_snv_single.fasta')
OutputVarproseq_single(mtab, proteinseq, outfile, ids, lablersid=TRUE)

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(customProDB)
Loading required package: IRanges
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: S4Vectors
Loading required package: stats4

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: AnnotationDbi
Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: biomaRt
> ### Name: OutputVarproseq_single
> ### Title: Output the variant(SNVs) protein sequences into FASTA format
> ### Aliases: OutputVarproseq_single
> 
> ### ** Examples
> 
> vcffile <- system.file("extdata/vcfs", "test1.vcf", package="customProDB")
> vcf <- InputVcf(vcffile)
> table(values(vcf[[1]])[['INDEL']])

FALSE  TRUE 
   54     7 
> index <- which(values(vcf[[1]])[['INDEL']] == FALSE)
> SNVvcf <- vcf[[1]][index]
> load(system.file("extdata/refseq", "exon_anno.RData",
+ package="customProDB"))
> load(system.file("extdata/refseq", "dbsnpinCoding.RData",
+     package="customProDB"))
> load(system.file("extdata/refseq", "procodingseq.RData",
+     package="customProDB"))
> load(system.file("extdata/refseq", "ids.RData", package="customProDB"))
> load(system.file("extdata/refseq", "proseq.RData", package="customProDB"))
> postable_snv <- Positionincoding(SNVvcf, exon, dbsnpinCoding)
> txlist <- unique(postable_snv[, 'txid'])
> codingseq <- procodingseq[procodingseq[, 'tx_id'] %in% txlist, ]
> mtab <- aaVariation (postable_snv, codingseq)
> outfile <- paste(tempdir(), '/test_snv_single.fasta',sep='')
> OutputVarproseq_single(mtab, proteinseq, outfile, ids, lablersid=TRUE)