R Graphical Manual

Browse All

Last data update: 2014.03.03

R: Class "Synapter"

Synapter

R Documentation

Class "Synapter"

Description

A reference class to store, manage and process Synapt G2 data to combine identification and quantitation results.

The data, intermediate and final results are stored together in such a ad-how container called a class. In the frame of the analysis of a set of 3 data files, namely as identification peptide, a quantitation peptide and a quantitation Pep3D, such a container is created and populated, updated according to the user's instructions and used to display and export results.

The functionality of the synapter package implemented in the Synapter class in described in the Details section below. Documentation for the individual methods is provided in the Methods section. Finally, a complete example of an analysis is provided in the Examples section, at the end of this document.

See also papers by Shliaha et al. for details about ion mobility separation and the manuscript describing the synapter methodology.

Usage

Synapter(filenames, master) ## creates an instance of class 'Synapter'

Arguments

filenames

A named list of file names to be load. The names must be 'identpeptide', 'quantpeptide', 'quantpep3d' and 'fasta'. If missing, dialog boxes pop up to select the four files manually. identpeptide can be a csv final peptide file (from PLGS) or a saved "MasterPeptides" data object as created by makeMaster if working with master peptide data. To serialise the "MasterPeptides" instance, use the saveRDS function, and file extenstion rds.

master

A logical that defines if the identification file is a master file. See makeMaster for details about this strategy.

Details

A Synapter object logs every operation that is applied to it. When displayed with show or when the name of the instance is typed at the R console, the original input file names, all operations and resulting the size of the respective data are displayed. This allows the user to trace the effect of respective operations.

Loading the data

The construction of the data and analysis container, technically defined as an instance or object of class Synapter, is created with the Synapter constructor. This function opens four dialog boxes for the user to point to the input files, namely (and in that order), the identification final peptide csv file, the quantitation final peptide csv file and the quantitation Pep3D csv file (as exported from the PLGS software) and the fasta file use for peptide identification. The files are read and the data is stored in the newly created Synapter instance. The file names can also be specified as a named list with names 'identpeptide', 'quantpeptide' and 'quantpep3d' respectively.

The final peptide files are filtered to retain peptides with matchType corresponding to PepFrag1 and PepFrag2, corresponding to unmodified round 1 and 2 peptide identification. Other types, like NeutralLoss_NH3, NeutralLoss_H20, InSource, MissedCleavage or VarMod are not considered in the rest of the analysis. The quantitation Pep3D data is filtered to retain Function equal to 1 and unique quantitation spectrum ids, i.e. unique entries for multiple charge states or isotopes of an EMRT (exact mass-retention time features).

Then, p-values for Regular peptides are computed based on the Regular and Random database types score distributions, as described in KÃ¤ll et al., 2008a. Only unique peptide sequences are taken into account: in case of duplicated peptides, only one entry is kept. Empirical p-values are adjusted using Bonferroni and Benjamini and Hochberg, 1995 (multtest package) and q-values are computed using the qvalue package (Storey JD and Tibshirani R., 2003 and KÃ¤ll et al., 2008b). Only Regular entries are stored in the resulting data for subsequent analysis.

The data tables can be exported as csv spreadsheets with the writeIdentPeptides and writeQuantPeptides methods.

Filtering identification and quantitation peptide

The first step of the analysis aims to match reliable peptide. The final peptide datasets are filtered based on the FDR (BH is default) using the filterQuantPepScore and filterIdentPepScore methods. Several plots are provided to illustrate peptide score densities (from which p-values are estimated, plotPepScores; use getPepNumbers to see how many peptides were available) and q-values (plotFdr).

Peptides matching to multiple proteins in the fasta file (non-unique tryptic identification and quantitation peptides) can be discarded with the filterUniqueDbPeptides method. One can also filter on the peptide length using filterPeptideLength.

Another filtering criterion is mass accuracy. Error tolerance quantiles (in ppm, parts per million) can be visualised with the plotPpmError method. The values can be retrieved with getPpmErrorQs. Filtering is then done separately for identification and quantitation peptide data using filterIdentPpmError and filterQuantPpmError respectively. The previous plotting functions can be used again to visualise the resulting distribution.

Filtering can also be performed at the level of protein false positive rate, as computed by the PLGS application (protein.falsePositiveRate column), which counts the percentage of decoy proteins that have been identified prior to the regular protein of interest. This can be done with the filterIdentProtFpr and filterQuantProtFpr methods. Note that this field is erroneously called a false positive rate in the PLGS software and the associated manuscript; it is a false discovery rate.

Merging identification and quantitation peptides

Common and reliable identification and quantitation peptides are then matched based on their sequences and merged using the mergePeptides method.

Retention time modelling

Systematic differences between identification features and quantitation features retention times are modelled by fitting a local regression (see the loess function for details), using the modelRt method. The smoothing parameter, or number of neighbour data points used the for local fit, is controlled by the span parameter that can be set in the above method.

The effect of this parameter can be observed with the plotRt method, specifying what = "data" as parameters. The resulting model can then be visualised with the above method specifying what = "model", specifying up to 3 number of standard deviations to plot. A histogram of retention time differences can be produced with the plotRtDiffs method.

Mention plotFeatures here.

Grid search to optimise matching tolerances

Matching of identification peptides and quantitation EMRTs is done within a mass tolerance in parts per million (ppm) and the modelled retention time +/- a certain number of standard deviations. To help in the choice of these two parameters, a grid search over a set of possible values is performed and performance metrics are recorded, to guide in the selection of a 'best' pair of parameters.

The following metrics are computed: (1) the percentage of identification peptides that matched a single quantitation EMRT (called prcntTotal), (2) the percentage of identification peptides used in the retention time model that matched the quantitation EMRT corresponding to the correct quantitation peptide in ident/quant pair of the model (called prcntModel) and (3) the detailed about the matching of the features used for modelling (accessible with getGridDetails) and the corresponding details grid that reports the percentage of correct unique assignments. The detailed grid results specify the number of non matched identification peptides (0), the number of correctly (1) or wrongly (-1) uniquely matched identification peptides, the number of identification peptides that matched 2 or more peptides including (2+) or excluding (2-) the correct quantitation equivalent are also available.

See the next section for additional details about how matching. The search is performed with the searchGrid method, possibly on a subset of the data (see Methods and Examples sections for further details).

The parameters used for matching can be set manually with setPpmError and setRtNsd respectively, or using setBestGridParams to apply best parameters as defined using the grid search. See example and method documentation for details.

Identification transfer: matching identification peptides and quantitation EMRTs

The identification peptide - quantitation EMRT matching, termed identification transfer, is performed using the best parameters, as defined above with a grid search, or using user-defined parameters.

Matching is considered successful when one and only one EMRT is found in the mass tolerance/retention time window defined by the error ppm and number of retention time standard deviations parameters. The values of uniquely matched EMRTs are reported in the final matched dataframe that can be exported (see below). If however, none or more than one EMRTs are matched, 0 or the number of matches are reported.

As identification peptides are serially individually matched to 'close' EMRTs, it is possible for peptides to be matched the same EMRT independently. Such cases are reported as -1 in the results dataframes.

The results can be assess using the plotEMRTtable (or getEMRTtable to retrieve the values) and performace methods. The former shows the number of identification peptides assigned to none (0), exactly 1 (1) or more (> 2) EMRTs. The latter method reports matched identification peptides, the number of (q-value and protein FPR filtered) identification and quantitation peptides. Matched EMRT and quantitation peptide numbers are then compared calculating the synapter enrichment (100 * ( synapter - quant ) / quant) and Venn counts.

Exporting and saving data

The merged identification and quantitation peptides can be exported to csv using the writeMergedPeptides method. Similarly, the matched identification peptides and quantitation EMRTs are exported with writeMatchedEMRTs.

Complete Synapter instances can be serialised with save, as any R object, and reloaded with load for further analysis.

Methods

Analysis methods

mergePeptides: signature(object = "Synapter"): Merges quantitation and identification final peptide data, used to perform retention time modelling (see modelRt below).
modelRt: signature(object = "Synapter", span = "numeric"): Performs local polynomial regression fitting (see loess) retention time alignment using span parameter value to control the degree of smoothing.
findEMRTs: signature(object = "Synapter", ppm = "numeric", nsd = "numeric", mergedEMRTs = c("rescue", "copy", "transfer")): Finds EMRTs matching identification peptides using ppm mass tolerance and nsd number of retention time standard deviations. The last two parameters are optional if previously set with setPpmError and setRtNsd or, better, setBestGridParams (see below). The mergedEMRTs parameter defined the behaviour for those high confidence features that where identified in both identification and quantitation acquisitions and used for the retention time model (see mergePeptides). Prior to version 1.1.1, these features were transferred from the quantitation pep3d file if unique matches were found, as any feature ("transfer"). As a result, those matching 0 or > 1 EMRTs were quantified as NA. The default is now to "rescue" the quantitation values of these by directly retrieving the data from the quantification peptide data. Alternatively, the quantitation values for these features can be directly taken from the quantitation peptide data using "copy", thus effectively bypassing identification transfer.
searchGrid: signature(object="Synapter", ppms="character", nsds="character", subset="numeric", n = "numeric", verbose="logical"): Performs a grid search. The grid is defined by the ppm and nsd numerical vectors, representing the sequence of values to be tested. Default are seq(5, 20, 2) and seq(0.5, 5, 0.5) respectively. subset and n allow to use a randomly chosen subset of the data for the grid search to reduce search time. subset is a numeric, between 0 and 1, describing the percentage of data to be used; n specifies the absolute number of feature to use. The default is to use all data. verbose controls whether textual output should be printed to the console. (Note, the mergedEMRTs value used in internal calls to findEMRTs is "transfer" - see findEMRTs for details).

Methods to display, access and set data

show: signature(object = "Synapter"): Display object by printing a summary to the console.
dim: signature(x="Synapter"): Returns a list of dimensions for the identification peptide, quantitation peptide, merged peptides and matched features data sets.
inputFiles: signature(object="Synapter"): Returns a character of length 4 with the names of the input files used as identpeptide, quantpeptide, quantpep3d and fasta.
getLog: signature(object="Synapter"): Returns a character of variable length with a summary of processing undergone by object.
getGrid: signature(object="Synapter", digits = "numeric"): Returns a named list of length 3 with the precent of total (prcntTotal), percent of model (prcntModel) and detailed (details) grid search results. The details grid search reports the proportion of correctly assigned features (+1) to all unique assignments (+1 and -1). Values are rounded to 3 digits by default.
getGridDetails: signature(object="Synapter"): Returns a list of number of ..., -2, -1, 0, +1, +2, ... results found for each of the ppm/nsd pairs tested during the grid search.
getBestGridValue: signature(object="Synapter"): Returns a named numeric of length 3 with best grid values for the 3 searches. Names are prcntTotal, prcntModel and details.
getBestGridParams: signature(object="Synapter"): Returns a named list of matrices (prcntTotal, prcntModel and details). Each matrix gives the ppm and nsd pairs that yielded the best grid values (see getBestGridValue above).
setBestGridParams: signature(object="Synapter", what="character"): This methods set the best parameter pair, as determined by what. Possible values are auto (default), model, total and details. The 3 last ones use the (first) best parameter values as reported by getBestGridParams. auto uses the best model parameters and, if several best pairs exists, the one that maximises details is selected.
setPepScoreFdr: signature(object="Synapter", fdr = "numeric"): Sets the peptide score false discovery rate (default is 0.01) threshold used by filterQuantPepScore and filterIdentPepScore.
getPepScoreFdr: signature(object="Synapter"): Returns the peptide false discrovery rate threshold.
setIdentPpmError: signature(object="Synapter", ppm = "numeric"): Set the identification mass tolerance to ppm (default 10).
getIdentPpmError: signature(object="Synapter"): Returns the identification mass tolerance.
setQuantPpmError: signature(object="Synapter", ppm = "numeric"): Set the quantitation mass tolerance to ppm (default 10).
getQuantPpmError: signature(object="Synapter"): Returns the quantitation mass tolerance.
setPpmError: signature(object="Synapter", ppm = "numeric"): Sets the identification and quantitation mass tolerance ppm (default is 10).
setLowessSpan: signature(object="Synapter", span = "numeric"): Sets the loess span parameter; default is 0.05.
getLowessSpan: signature(object="Synapter"): Returns the span parameter value.
setRtNsd: signature(object="Synapter", nsd = "numeric"): Sets the retention time tolerance nsd, default is 2.
getRtNsd: signature(object="Synapter"): Returns the value of the retention time tolerance nsd.
getPpmErrorQs: signature(object="Synapter", qs = "numeric", digits = "numeric"): Returns the mass tolerance qs quantiles (default is c(0.25, 0.5, 0.75, seq(0.9, 1, 0.01)) for the identification and quantitation peptides. Default is 3 digits.
getRtQs: signature(object="Synapter", qs = "numeric", digits = "numeric"): Returns the retention time tolerance qs quantiles (default is c(0.25, 0.5, 0.75, seq(0.9, 1, 0.01)) for the identification and quantitation peptides. Default is 3 digits.
getPepNumbers: signature(object="Synapter"): Returns the number of regular and random quantitation and identification peptide considered for p-value calculation and used to plot the score densities (see plotPepScores). Especially the difference between random and regular entries are informative in respect with the confidence of the random scores distribution.
showFdrStats: signature(object="Synapter", k = "numeric"): Returns a named list of length 2 with the proportion of identification and quantitation peptides that are considered significant with a threshold of k (default is c(0.001, 0.01, 0.5, 0.1)) using raw and adjusted p-values/q-values.
getEMRTtable: signature(object="Synapter"): Returns a table with the number of 0, 1, 2, ... assigned EMRTs.
performance: signatute(object="Synapter", verbose = TRUE): Returns (and displays, if verbose) the performance of the synapter analysis.
performance2: signatute(object="Synapter", verbose = TRUE): Returns (and displays, if verbose) information about number of missing values and identification source of transfered EMRTs.

Filters

filterUniqueDbPeptides: signature(object="Synapter"): This method first digests the fasta database file and keeps unique tryptic peptides. (NOTE: since version 1.5.3, the tryptic digestion uses the cleaver package, replacing the more simplistic inbuild function. The effect of this change is documented in https://github.com/lgatto/synapter/pull/47). The number of missed cleavages can be set as missedCleavages (default is 0). Those peptide sequences are then used as a filter against the identification and quantitation peptides, where only unique proteotyptic instances (no miscleavage allowed by default) are eventually kept in the object instance. This method also removes any additional duplicated peptides, that would not match any peptides identified in the fasta database.
filterUniqueQuantDbPeptides: signature(object="Synapter", missedCleavages = 0, verbose = TRUE): As filterUniqueDbPeptides for quantitation peptides only.
filterUniqueIdentDbPeptides: signature(object="Synapter", missedCleavages = 0, verbose = TRUE): As filterUniqueDbPeptides for identification peptides only.
filterQuantPepScore: signature(object="Synapter", fdr = "numeric", method = "character"): Filters the quantitation peptides using fdr false discovery rate. fdr is missing by default and is retrieved with getPepScoreFdr automatically. If not set, default value of 0.01 is used. method defines how to performe p-value adjustment; one of BH, Bonferrone or qval. See details section for more information.
filterIdentPepScore: signature(object="Synapter", fdr = "numeric", method = "charactet"): As filterQuantPepScore, but for identification peptides.
filterQuantProtFpr: signature(object="Synapter", fpr = "numeric"): Filters quantitation peptides using the protein false positive rate (erroneously defined as a FPR, should be FDR), as reported by PLGS, using threshold set by fpr (missing by default) or retrieved by getProtFpr.
filterIdentProtFpr: signature(object="Synapter", fpr = "numeric"): as filterQuantProtFpr, but for identification peptides.
filterQuantPpmError: signature(object="Synapter", ppm = "numeric"): Filters the quantitation peptides based on the mass tolerance ppm (default missing) provided or retrieved automatically using getPpmError.
filterIdentPpmError: signature(object="Synapter"): as filterQuantPpmError, but for identification peptides.

Plotting

plotPpmError: signature(object="Synapter", what = "character"): Plots the proportion of data against the mass error tolerance in ppms. Depending on what, the data for identification (what = "Ident"), quantitation (what = "Quant") or "both" is plotted.
plotRtDiffs: signature(object="Synapter", ...): Plots a histogram of retention time differences after alignments. ... is passed to hist.
plotRt: signature(object="Synapter", what = "character", f = "numeric", nsd = "numeric"): Plots the Identification - Quantitation retention time difference as a function of the Identification retention time. If what is "data", two plots are generated: one ranging the full range of retention time differences and one focusing on the highest data point density and showing models with various span parameter values, as defined by f (default is 2/3, 1/2, 1/4, 1/10, 1/16, 1/25, 1/50, passed as a numed numeric). If what is "model", a focused plot with the applied span parameter is plotted and areas of nsd (default is x(1, 3, 5) number of standard deviations are shaded around the model.
plotPepScores: signature(object="Synapter"): Plots the distribution of random and regular peptide scores for identification and quantitation features. This reflects how peptide p-values are computed. See also getPepNumbers.
plotFdr: signature(object="Synapter", method = "character"): Displays 2 plots per identification and quantitation peptides, showing the number of significant peptides as a function of the FDR cut-off and the expected false number of false positive as a number of significant tests. PepFrag 1 and 2 peptides are illustrated on the same figures. These figures are adapted from plot.qvalue. method, one of "BH", "Bonferroni" or "qval", defines what identification statistics to use.
plotEMRTtable: signature(object="Synapter"): Plots the barchart of number or 0, 1, 2, ... assigned EMRTs (see getEMRTtable) .
plotGrid: signature(object="Synapter", what = "character"): Plots a heatmap of the respective grid search results. This grid to be plotted is controlled by what: "total", "model" or "details" are available.
plotFeatures: signature(object="Synapter", what = "character", xlim = "numeric", ylim = "numeric"): Plots the retention time against precursor mass space. If what is "all", three such plots are created side by side: for the identification peptides, the quantitation peptides and the quantitation Pep3D data. If what is "some", a subset of the rt/mass space can be defined with xlim (default is c(40, 60)) and ylim (default is c(1160, 1165)) and identification peptide, quantitation peptides and EMRTs are presented on the same graph as grey dots, blue dots and red crosses respectively. In addition, rectangles based on the ppm and nsd defined tolerances (see setPpmError and setNsdError) are drawn and centered at the expected modelled retention time. This last figure allows to visualise the EMRT matching.

Exporters

writeMergedPeptides: signature(object="Synapter", file = "character", what = "character", ...): Exports the merged peptide data to a comma-separated file (default name is "Res-MergedPeptides.csv"). what can be "light" (default) or "full" and specifies if the full data or only selected columns are exported. ... are passed to write.csv.
writeMatchedEMRTs: signature(object="Synapter", file = "character", what = "character", ...): As above, saving the matched EMRT table.
writeIdentPeptides: signature(object="Synapter", file = "character", ...): As above, exporting the identification peptide data.
writeQuantPeptides: signature(object="Synapter", file = "character", ...): A above, exporting the quantitation peptide data.

Other

as(, "MSnSet"): signature(x = "Synapter"): Coerce object from Synapter to MSnSet class.

Author(s)

Laurent Gatto lg390@cam.ac.uk

References

KÃ¤ll L, Storey JD, MacCoss MJ, Noble WS Posterior error probabilities and false discovery rates: two sides of the same coin. J Proteome Res. 2008a Jan; 7:(1)40-4

Bonferroni single-step adjusted p-values for strong control of the FWER.

Benjamini Y. and Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Statist. Soc. B., 1995, Vol. 57: 289-300.

Storey JD and Tibshirani R. Statistical significance for genome-wide experiments. Proceedings of the National Academy of Sciences, 2003, 100: 9440-9445.

KÃ¤ll, Storey JD, MacCoss MJ, Noble WS Assigning significance to peptides identified by tandem mass spectrometry using decoy databases. J Proteome Res. 2008b Jan; 7:(1)29-34

Improving qualitative and quantitative performance for MSE-based label free proteomics, N.J. Bond, P.V. Shliaha, K.S. Lilley and L. Gatto, Journal of Proteome Research, 2013, in press.

The Effects of Travelling Wave Ion Mobility Separation on Data Independent Acquisition in Proteomics Studies, P.V. Shliaha, N.J. Bond, L. Gatto and K.S. Lilley, Journal of Proteome Research, 2013, in press.

Examples

library(synapter) ## always needed

## Not run: 
## (1) Construction - to create your own data objects
synapterTiny <- Synapter()

## End(Not run)

## let's use synapterTiny, shipped with the package
synapterTinyData() ## loads/prepares the data
synapterTiny ## show object

## (2) Filtering
## (2.1) Peptide scores and FDR

## visualise/explore peptide id scores
plotPepScores(synapterTiny)
getPepNumbers(synapterTiny)

## filter data
filterUniqueDbPeptides(synapterTiny) ## keeps unique proteotypic peptides
filterPeptideLength(synapterTiny, l = 7) ## default length is 7

## visualise before FDR filtering
plotFdr(synapterTiny)

setPepScoreFdr(synapterTiny, fdr = 0.01) ## optional
filterQuantPepScore(synapterTiny, fdr = 0.01) ## specifying FDR
filterIdentPepScore(synapterTiny) ## FDR not specified, using previously set value

## (2.2) Mass tolerance
getPpmErrorQs(synapterTiny)
plotPpmError(synapterTiny, what="Ident")
plotPpmError(synapterTiny, what="Quant")

setIdentPpmError(synapterTiny, ppm = 20) ## optional
filterQuantPpmError(synapterTiny, ppm = 20)
## setQuantPpmError(synapterTiny, ppm = 20) ## set quant ppm threshold below
filterIdentPpmError(synapterTiny, ppm=20)

filterIdentProtFpr(synapterTiny, fpr = 0.01)
filterQuantProtFpr(synapterTiny, fpr = 0.01)

getPpmErrorQs(synapterTiny) ## to be compared with previous output

## (3) Merge peptide sequences
mergePeptides(synapterTiny)

## (4) Retention time modelling
plotRt(synapterTiny, what="data")
setLowessSpan(synapterTiny, 0.05)
modelRt(synapterTiny) ## the actual modelling
getRtQs(synapterTiny)
plotRtDiffs(synapterTiny)
## plotRtDiffs(synapterTiny, xlim=c(-1, 1), breaks=500) ## pass parameters to hist()
plotRt(synapterTiny, what="model") ## using default nsd 1, 3, 5
plotRt(synapterTiny, what="model", nsd=0.5) ## better focus on model

plotFeatures(synapterTiny, what="all")
setRtNsd(synapterTiny, 3)     ## RtNsd and PpmError are used for detailed plot
setPpmError(synapterTiny, 10) ## if not set manually, default values are set automatically
plotFeatures(synapterTiny, what="some", xlim=c(36,44), ylim=c(1161.4, 1161.7))
## best plotting to svg for zooming

set.seed(1) ## only for reproducibility of this example

## (5) Grid search to optimise EMRT matching parameters
searchGrid(synapterTiny,
           ppms = 7:10,  ## default values are 5, 7, ..., 20
           nsds = 1:3,   ## default values are 0.5, 1,  ..., 5
           subset = 0.2) ## default is 1
## alternatively, use 'n = 1000' to use exactly
## 1000 randomly selected features for the grid search
getGrid(synapterTiny)  ## print the grid
getGridDetails(synapterTiny)  ## grid details
plotGrid(synapterTiny, what = "total")   ## plot the grid for total matching
plotGrid(synapterTiny, what = "model")   ## plot the grid for matched modelled feature
plotGrid(synapterTiny, what = "details") ## plot the detail grid
getBestGridValue(synapterTiny)  ## return best grid values
getBestGridParams(synapterTiny) ## return parameters corresponding to best values
setBestGridParams(synapterTiny, what = "auto") ## sets RtNsd and PpmError according the grid results
## 'what' could also be "model", "total" or "details"
## setPpmError(synapterTiny, 12) ## to manually set values
## setRtNsd(synapterTiny, 2.5)

## (6) Matching ident peptides and quant EMRTs
findEMRTs(synapterTiny)
plotEMRTtable(synapterTiny)
getEMRTtable(synapterTiny)
performance(synapterTiny)
performance2(synapterTiny)

## (7) Exporting data to csv spreadsheets
writeMergedPeptides(synapterTiny, what = "light") ## or what="full"
writeMergedPeptides(synapterTiny, file = "myresults.csv", what="light")
writeMatchedEMRTs(synapterTiny, what = "light")   ## or what="full"
writeMatchedEMRTs(synapterTiny, file = "myresults2.csv", what="light")
## These will export the filter peptide data
writeIdentPeptides(synapterTiny, file = "myIdentPeptides.csv")
writeQuantPeptides(synapterTiny, file = "myQuantPeptides.csv")
## If used right after loading, the non-filted data will be exported

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(synapter)
Loading required package: MSnbase
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: mzR
Loading required package: Rcpp
Loading required package: BiocParallel
Loading required package: ProtGenerics

This is MSnbase version 1.20.7 
  Read '?MSnbase' and references therein for information
  about the package and how to get started.


Attaching package: 'MSnbase'

The following object is masked from 'package:stats':

    smooth

The following object is masked from 'package:base':

    trimws

This is the 'synapter' version 1.14.2.
Read '?synapter' to get started.

> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/synapter/Synapter.Rd_%03d_medium.png", width=480, height=480)
> ### Name: Synapter
> ### Title: Class "Synapter"
> ### Aliases: Synapter class:Synapter show,Synapter-method
> ###   dim,Synapter-method inputFiles,Synapter-method inputFiles
> ###   getLog,Synapter-method getLog mergePeptides,Synapter-method
> ###   mergePeptides modelRt,Synapter-method modelRt
> ###   findEMRTs,Synapter-method findEMRTs searchGrid,Synapter-method
> ###   searchGrid getGrid,Synapter-method getGrid
> ###   getGridDetails,Synapter-method getGridDetails
> ###   getBestGridValue,Synapter-method getBestGridValue
> ###   getBestGridParams,Synapter-method getBestGridParams
> ###   setBestGridParams,Synapter-method setBestGridParams
> ###   setPepScoreFdr,Synapter-method setPepScoreFdr
> ###   getPepScoreFdr,Synapter-method getPepScoreFdr
> ###   setIdentPpmError,Synapter-method setIdentPpmError
> ###   getIdentPpmError,Synapter-method getIdentPpmError
> ###   setQuantPpmError,Synapter-method setQuantPpmError
> ###   getQuantPpmError,Synapter-method getQuantPpmError
> ###   setPpmError,Synapter-method setPpmError setLowessSpan,Synapter-method
> ###   setLowessSpan getLowessSpan,Synapter-method getLowessSpan
> ###   setRtNsd,Synapter-method setRtNsd getRtNsd,Synapter-method getRtNsd
> ###   getPpmErrorQs,Synapter-method getPpmErrorQs getRtQs,Synapter-method
> ###   getRtQs getPepNumbers,Synapter-method getPepNumbers
> ###   showFdrStats,Synapter-method showFdrStats setProtFpr,Synapter-method
> ###   setProtFpr getProtFpr,Synapter-method getProtFpr
> ###   getEMRTtable,Synapter-method getEMRTtable performance,Synapter-method
> ###   performance2,Synapter-method performance performance2
> ###   filterPeptideLength,Synapter-method filterPeptideLength
> ###   filterUniqueDbPeptides,Synapter-method filterUniqueDbPeptides
> ###   filterUniqueQuantDbPeptides,Synapter-method
> ###   filterUniqueQuantDbPeptides
> ###   filterUniqueIdentDbPeptides,Synapter-method
> ###   filterUniqueIdentDbPeptides filterQuantPepScore,Synapter-method
> ###   filterQuantPepScore filterIdentPepScore,Synapter-method
> ###   filterIdentPepScore filterIdentPpmError,Synapter-method
> ###   filterIdentPpmError filterQuantPpmError,Synapter-method
> ###   filterQuantPpmError filterIdentProtFpr,Synapter-method
> ###   filterIdentProtFpr filterQuantProtFpr,Synapter-method
> ###   filterQuantProtFpr plotEMRTtable,Synapter-method plotEMRTtable
> ###   plotFeatures,Synapter-method plotFeatures plotGrid,Synapter-method
> ###   plotGrid plotPepScores,Synapter-method plotPepScores
> ###   plotPpmError,Synapter-method plotPpmError plotFdr,Synapter-method
> ###   plotFdr plotRt,Synapter-method plotRt plotRtDiffs,Synapter-method
> ###   plotRtDiffs writeMatchedEMRTs,Synapter-method writeMatchedEMRTs
> ###   writeMergedPeptides,Synapter-method writeMergedPeptides
> ###   writeIdentPeptides,Synapter-method writeIdentPeptides
> ###   writeQuantPeptides,Synapter-method writeQuantPeptides
> ###   as.MSnSet.Synapter
> ### Keywords: classes
> 
> ### ** Examples
> 
> library(synapter) ## always needed
> 
> ## Not run: 
> ##D ## (1) Construction - to create your own data objects
> ##D synapterTiny <- Synapter()
> ## End(Not run)
> 
> ## let's use synapterTiny, shipped with the package
> synapterTinyData() ## loads/prepares the data
> synapterTiny ## show object
Object of class "Synapter" 
Package version 0.9.0 
Data files:
 + Quantitation pep file: 02_MSe_tiny.csv 
 + Identification pep file: 01_HDMSe_tiny.csv 
 + Quantitation Pep3DAMRT file: 03_Pep3D_tiny.csv 
 + Fasta file: 04_test_database.fasta 
Log:
[1] "Instance created on Wed Jun  6 23:37:36 2012"
[2] "Read identification peptide data [5915,18]"  
[ 8 lines ]
[11] "Added identification statistics to identification data"
[12] "Filtered identification Random entries [4979,22]"
> 
> ## (2) Filtering
> ## (2.1) Peptide scores and FDR
> 
> ## visualise/explore peptide id scores
> plotPepScores(synapterTiny)
> getPepNumbers(synapterTiny)
      PepFrag1.Random PepFrag1.Regular PepFrag2.Random PepFrag2.Regular
ident             256             3522             448             1393
quant             496             2664             500              745
> 
> ## filter data
> filterUniqueDbPeptides(synapterTiny) ## keeps unique proteotypic peptides
Fasta file: 897 proteins
            21914 out of 22960 tryptic peptides are proteotypic.
> filterPeptideLength(synapterTiny, l = 7) ## default length is 7
> 
> ## visualise before FDR filtering
> plotFdr(synapterTiny)
> 
> setPepScoreFdr(synapterTiny, fdr = 0.01) ## optional
> filterQuantPepScore(synapterTiny, fdr = 0.01) ## specifying FDR
> filterIdentPepScore(synapterTiny) ## FDR not specified, using previously set value
> 
> ## (2.2) Mass tolerance
> getPpmErrorQs(synapterTiny)
        25%   50%   75%   90%   91%   92%   93%   94%    95%    96%    97%
Ident 0.982 2.183 4.070 6.750 7.663 8.299 9.136 9.989 12.060 13.997 16.082
Quant 0.432 0.972 1.682 3.083 3.341 3.606 3.992 5.021  5.733  6.455  7.736
         98%    99%   100%
Ident 19.288 22.496 28.664
Quant  9.124 14.897 24.599
> plotPpmError(synapterTiny, what="Ident")
> plotPpmError(synapterTiny, what="Quant")
> 
> setIdentPpmError(synapterTiny, ppm = 20) ## optional
> filterQuantPpmError(synapterTiny, ppm = 20)
> ## setQuantPpmError(synapterTiny, ppm = 20) ## set quant ppm threshold below
> filterIdentPpmError(synapterTiny, ppm=20)
> 
> filterIdentProtFpr(synapterTiny, fpr = 0.01)
> filterQuantProtFpr(synapterTiny, fpr = 0.01)
> 
> getPpmErrorQs(synapterTiny) ## to be compared with previous output
        25%   50%   75%   90%   91%   92%   93%   94%   95%    96%    97%
Ident 0.968 2.144 3.939 6.233 6.589 7.035 7.942 8.775 9.523 10.688 12.643
Quant 0.431 0.964 1.661 2.973 3.137 3.386 3.702 4.214 5.211  6.095  7.117
         98%    99%   100%
Ident 14.563 17.007 19.744
Quant  8.289  9.558 19.817
> 
> ## (3) Merge peptide sequences
> mergePeptides(synapterTiny)
> 
> ## (4) Retention time modelling
> plotRt(synapterTiny, what="data")
> setLowessSpan(synapterTiny, 0.05)
> modelRt(synapterTiny) ## the actual modelling
> getRtQs(synapterTiny)
   25%    50%    75%    90%    91%    92%    93%    94%    95%    96%    97% 
 0.003  0.008  0.016  0.029  0.031  0.034  0.040  0.047  0.057  0.072  0.463 
   98%    99%   100% 
 4.161 10.703 32.140 
> plotRtDiffs(synapterTiny)
> ## plotRtDiffs(synapterTiny, xlim=c(-1, 1), breaks=500) ## pass parameters to hist()
> plotRt(synapterTiny, what="model") ## using default nsd 1, 3, 5
> plotRt(synapterTiny, what="model", nsd=0.5) ## better focus on model
> 
> plotFeatures(synapterTiny, what="all")
> setRtNsd(synapterTiny, 3)     ## RtNsd and PpmError are used for detailed plot
> setPpmError(synapterTiny, 10) ## if not set manually, default values are set automatically
> plotFeatures(synapterTiny, what="some", xlim=c(36,44), ylim=c(1161.4, 1161.7))
> ## best plotting to svg for zooming
> 
> set.seed(1) ## only for reproducibility of this example
> 
> ## (5) Grid search to optimise EMRT matching parameters
> searchGrid(synapterTiny,
+            ppms = 7:10,  ## default values are 5, 7, ..., 20
+            nsds = 1:3,   ## default values are 0.5, 1,  ..., 5
+            subset = 0.2) ## default is 1
   |                                                                               |                                                                      |   0%   |                                                                               |======                                                                |   8%   |                                                                               |============                                                          |  17%   |                                                                               |==================                                                    |  25%   |                                                                               |=======================                                               |  33%   |                                                                               |=============================                                         |  42%   |                                                                               |===================================                                   |  50%   |                                                                               |=========================================                             |  58%   |                                                                               |===============================================                       |  67%   |                                                                               |====================================================                  |  75%   |                                                                               |==========================================================            |  83%   |                                                                               |================================================================      |  92%   |                                                                               |======================================================================| 100%
> ## alternatively, use 'n = 1000' to use exactly
> ## 1000 randomly selected features for the grid search
> getGrid(synapterTiny)  ## print the grid
$prcntTotal
      7     8     9    10
1 0.747 0.755 0.760 0.760
2 0.824 0.833 0.839 0.839
3 0.824 0.833 0.839 0.839

$prcntModel
      7     8     9    10
1 0.872 0.872 0.872 0.872
2 0.972 0.972 0.972 0.972
3 0.972 0.972 0.972 0.972

$details
      7     8     9    10
1 0.984 0.984 0.984 0.984
2 0.986 0.986 0.986 0.986
3 0.986 0.986 0.986 0.986

> getGridDetails(synapterTiny)  ## grid details
$`1:7`
 -1   0   1 
  4  33 251 

$`1:8`
 -1   0   1 
  4  33 251 

$`1:9`
 -1   0   1 
  4  33 251 

$`1:10`
 -1   0   1 
  4  33 251 

$`2:7`
 -1   0   1 
  4   4 280 

$`2:8`
 -1   0   1 
  4   4 280 

$`2:9`
 -1   0   1 
  4   4 280 

$`2:10`
 -1   0   1 
  4   4 280 

$`3:7`
 -1   0   1 
  4   4 280 

$`3:8`
 -1   0   1 
  4   4 280 

$`3:9`
 -1   0   1 
  4   4 280 

$`3:10`
 -1   0   1 
  4   4 280 

> plotGrid(synapterTiny, what = "total")   ## plot the grid for total matching
> plotGrid(synapterTiny, what = "model")   ## plot the grid for matched modelled feature
> plotGrid(synapterTiny, what = "details") ## plot the detail grid
> getBestGridValue(synapterTiny)  ## return best grid values
prcntTotal prcntModel    details 
 0.8390805  0.9722222  0.9859155 
> getBestGridParams(synapterTiny) ## return parameters corresponding to best values
$prcntTotal
     ppm nsd
[1,]   9   2
[2,]   9   3
[3,]  10   2
[4,]  10   3

$prcntModel
     ppm nsd
[1,]   7   2
[2,]   7   3
[3,]   8   2
[4,]   8   3
[5,]   9   2
[6,]   9   3
[7,]  10   2
[8,]  10   3

$details
     ppm nsd
[1,]   7   2
[2,]   7   3
[3,]   8   2
[4,]   8   3
[5,]   9   2
[6,]   9   3
[7,]  10   2
[8,]  10   3

> setBestGridParams(synapterTiny, what = "auto") ## sets RtNsd and PpmError according the grid results
> ## 'what' could also be "model", "total" or "details"
> ## setPpmError(synapterTiny, 12) ## to manually set values
> ## setRtNsd(synapterTiny, 2.5)
> 
> ## (6) Matching ident peptides and quant EMRTs
> findEMRTs(synapterTiny)
> plotEMRTtable(synapterTiny)
> getEMRTtable(synapterTiny)

   0    1 
 559 2486 
> performance(synapterTiny)
(S) Synapter: 2486 EMRTs uniquely matched.
(I) Ident: 3045 peptides.
(Q) Quant:   1840 peptides.
Enrichment (S/Q): 35.11%
Overlap:
   Q    S   QS 
 433 1077 1407 
> performance2(synapterTiny)
          na.counts
id.source  FALSE TRUE
  rescue      35    0
  transfer  2486  524
> 
> ## (7) Exporting data to csv spreadsheets
> writeMergedPeptides(synapterTiny, what = "light") ## or what="full"
> writeMergedPeptides(synapterTiny, file = "myresults.csv", what="light")
> writeMatchedEMRTs(synapterTiny, what = "light")   ## or what="full"
> writeMatchedEMRTs(synapterTiny, file = "myresults2.csv", what="light")
> ## These will export the filter peptide data
> writeIdentPeptides(synapterTiny, file = "myIdentPeptides.csv")
> writeQuantPeptides(synapterTiny, file = "myQuantPeptides.csv")
> ## If used right after loading, the non-filted data will be exported
> 
> 
> 
> 
> 
> dev.off()
null device 
          1 
>