Last data update: 2014.03.03

R: Calculate heavy labeled peptides
calculateHeavyLabels R Documentation

Calculate heavy labeled peptides

Description

A function to calculate heavy labeled peptides for proteins stored in a Proteins object.

Usage

calculateHeavyLabels(proteins, peptides, maxN = 20L, nN = 4L, nC = 3L,
  endsWith = c("K", "R", "G"), ...)

Arguments

proteins

A Proteins object.

peptides

A named character vector containing the peptides of interest. The names must match the UniProt accession numbers of the proteins in the proteins argument.

maxN

An integer, maximal length of the heavy labeled peptide.

nN

An integer, minimal number of amino acids at the N terminus.

nC

An integer, minimal number of amino acids at the C terminus.

endsWith

A character vector containing the allowed amino acids at the end of the resulting sequence (every peptide that doesn't end with one of these amino acids has to be one amino acid shorter than maxN).

...

Additional parameters passed to .addOverhangs.

Details

The digestion efficiency of enzymes like trypsin is below 100%. Spiked-in peptides for labeled quantitation therefore have to follow the same digestion rules as the peptides of interest, which makes it necessary to extend the peptides of interest by a few amino acids at the N- and C-terminus. These extensions should not be a cleavage point of the enzyme used. This method provides an easy interface to find the sequences for heavy labeled peptides that could be used as spike-ins for the peptides of interest. Please see the references for a more detailed discussion.
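The extension idea described above can be sketched in base R. This is only an illustration, not the implementation used by calculateHeavyLabels: the helper extend_peptide and the toy sequence are hypothetical, the overhang lengths are fixed at nN and nC, and no cleavage rules, maxN limit, or endsWith check are applied.

```r
## Hypothetical helper (not part of Pbase): flank a peptide with a fixed
## number of residues taken from the surrounding protein sequence.
extend_peptide <- function(protein_seq, peptide, nN = 4L, nC = 3L) {
  ## position of the first exact match of the peptide in the protein
  start <- as.integer(regexpr(peptide, protein_seq, fixed = TRUE))
  if (start == -1L) {
    stop("peptide not found in protein sequence")
  }
  end <- start + nchar(peptide) - 1L

  ## residues before and after the peptide, clipped at the sequence ends
  n_overhang <- substr(protein_seq, max(1L, start - nN), start - 1L)
  c_overhang <- substr(protein_seq, end + 1L,
                       min(nchar(protein_seq), end + nC))

  paste(n_overhang, peptide, c_overhang, sep = ".")
}

## made-up toy sequence, for illustration only
toy_seq <- "GGAAPRDHQKMEGFHIKSPKDSGG"
extend_peptide(toy_seq, "MEGFHIK")
```

The actual function reports overhangs of varying length, so it evidently applies additional rules beyond this fixed-width sketch.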

TODO: There should be a function to find the best labels for a given protein automatically.

Value

A data.frame with 6 columns:

  • Protein: The protein accession number.

  • Peptide: The peptide of interest.

  • N_overhang: The added sequence of the N-terminus.

  • C_overhang: The added sequence of the C-terminus.

  • spikeTideResult: A short description of the used creation rule.

  • spikeTide: The heavy labeled peptide that represents the peptide of interest best.
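As a rough illustration of how these columns relate, the sketch below rebuilds the data.frame by hand from the values printed in the example output on this page (it does not call calculateHeavyLabels). That the spikeTide column is the N overhang, peptide, and C overhang joined by dots is an inference from that output, not a documented guarantee.

```r
## values copied by hand from the example output on this page
res <- data.frame(
  Protein = c("A4UGR9", "A4UGR9", "A6H8Y1"),
  Peptide = c("MEGFHIK", "QGNMYTLSK", "GSTASNPQR"),
  N_overhang = c("DHQK", "AAPR", "VGAR"),
  C_overhang = c("SPK", "DS", "GRES"),
  spikeTideResult = "fully_representative",
  spikeTide = c("DHQK.MEGFHIK.SPK",
                "AAPR.QGNMYTLSK.DS",
                "VGAR.GSTASNPQR.GRES"),
  stringsAsFactors = FALSE
)

## join overhangs and peptide with dots and compare with spikeTide
rebuilt <- with(res, paste(N_overhang, Peptide, C_overhang, sep = "."))
identical(rebuilt, res$spikeTide)

## total number of residues per spike-in, ignoring the separating dots
spike_length <- nchar(gsub(".", "", res$spikeTide, fixed = TRUE))
spike_length
```

In this example every spike-in stays within the default maxN of 20 residues.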

Author(s)

Sebastian Gibb <mail@sebastiangibb.de> and Pavel Shliaha

References

The complete description of the issue: https://github.com/sgibb/cleaver/issues/5

Kito, Keiji, et al. A synthetic protein approach toward accurate mass spectrometric quantification of component stoichiometry of multiprotein complexes. Journal of proteome research 6.2 (2007): 792-800. http://dx.doi.org/10.1021/pr060447s

Examples

## example protein database
data(p, package = "Pbase")

## digest proteins into peptides
cleavedProteins <- cleave(p)

## find spike-ins for the peptides of interest
calculateHeavyLabels(cleavedProteins,
                      peptides = c(A4UGR9 = "MEGFHIK",
                                   A4UGR9 = "QGNMYTLSK",
                                   A6H8Y1 = "GSTASNPQR"))

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(Pbase)
This is Pbase version 0.12.2

> ### Name: calculateHeavyLabels
> ### Title: Calculate heavy labeled peptides
> ### Aliases: calculateHeavyLabels
> 
> ### ** Examples
> 
> ## example protein database
> data(p, package = "Pbase")
> 
> ## digest proteins into peptides
> cleavedProteins <- cleave(p)
> 
> ## find spike-ins for the peptides of interest
> calculateHeavyLabels(cleavedProteins,
+                       peptides = c(A4UGR9 = "MEGFHIK",
+                                    A4UGR9 = "QGNMYTLSK",
+                                    A6H8Y1 = "GSTASNPQR"))
  Protein   Peptide N_overhang C_overhang      spikeTideResult
1  A4UGR9   MEGFHIK       DHQK        SPK fully_representative
2  A4UGR9 QGNMYTLSK       AAPR         DS fully_representative
3  A6H8Y1 GSTASNPQR       VGAR       GRES fully_representative
            spikeTide
1    DHQK.MEGFHIK.SPK
2   AAPR.QGNMYTLSK.DS
3 VGAR.GSTASNPQR.GRES