Last data update: 2014.03.03

combine    R Documentation

Combining isotope, adduct and homologue series relations in HRMS data sets.

Description

Combines groups of isotope pattern peaks from pattern.search and groups of adduct peaks from adduct.search into components, with information on homologue series relations from homol.search attached. Includes some checks for component plausibility. Needs at least two of the three inputs: (1) isotope pattern relations, (2) adduct relations and (3) homologue series relations. Extracts the most intense peak per component, thus allowing for consistent comparison of component occurrence across different HRMS data sets.

Individual components and the peak relations therein can then be plotted with plotcomp. Counts of detected isotope m/z differences among components can be summarized with plotisotopes. Subsets of components and HRMS data can be selected interactively with ms.filter.

Usage

combine(pattern, adduct, homol = FALSE, rules = c(FALSE, FALSE, FALSE), dont = FALSE)

Arguments

pattern

List of type pattern produced by pattern.search. If not used, set to FALSE.

adduct

List of type adduct produced by adduct.search. If not used, set to FALSE.

homol

List of type homol produced by homol.search. If not used, set to FALSE (default).

rules

Vector with three entries of TRUE or FALSE. See the Rules setting section.

dont

Numeric vector with one or several values between 1 and 4, to exclude components with particular warnings; if not used, set to FALSE. See Details.

Details

The algorithm sorts relations among peaks in HRMS data sets generated by pattern.search, adduct.search and homol.search into components by repeating four consecutive steps. In the first step, proceeding along decreasing peak intensities, individual peaks are checked for being part of an isotope pattern group and thus relatable to other peaks. In the second step, all peaks within the group from the first step are checked for being part of adduct groups, thus relating them to further peaks. Steps one and two together yield the set of peaks defining a component. In the third step, all peaks in a component are checked for adduct or isotope pattern relations to other peaks not yet subsumed into the component, e.g. as a result of overlapping isotope pattern groups. These additional peaks are defined as interfering peaks. In the fourth step, all peaks found for a component are related to the homologue series they may be part of, if available. Once assigned to a component, peaks take no further part in subsequent repetitions of step one (except for interfering peaks, if rules[1]=TRUE) or in any of the other steps.
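The grouping logic of steps one to three can be illustrated with a minimal base R sketch. It is not the package's implementation: the pair tables iso_pairs and adduct_pairs, the helper related() and the function build_components() are hypothetical names invented here, only direct relations are followed, and step four (homologue series) is left out.

```r
# Toy peak table; peak IDs equal the row order of the input peak list.
peaks <- data.frame(id = 1:6, intensity = c(900, 300, 500, 120, 800, 50))

# Hypothetical relation tables (pairs of peak IDs), NOT the package format.
iso_pairs    <- rbind(c(1, 2), c(3, 4))  # isotope pattern relations
adduct_pairs <- rbind(c(1, 3))           # adduct relations

# All peaks directly related to any of `ids` in a pair table, plus `ids`.
related <- function(ids, pairs) {
  hit <- pairs[, 1] %in% ids | pairs[, 2] %in% ids
  unique(c(ids, pairs[hit, 1], pairs[hit, 2]))
}

build_components <- function(peaks, iso_pairs, adduct_pairs, rules1 = FALSE) {
  assigned <- rep(FALSE, nrow(peaks))
  out <- list()
  # Seeds are visited along decreasing peak intensity.
  for (seed in order(peaks$intensity, decreasing = TRUE)) {
    if (assigned[seed]) next
    core <- related(seed, iso_pairs)      # step 1: isotope pattern group
    core <- related(core, adduct_pairs)   # step 2: adduct partners
    core <- core[!assigned[core]]
    # Step 3: peaks related to the component but not part of it.
    extra <- setdiff(union(related(core, iso_pairs),
                           related(core, adduct_pairs)), core)
    extra <- extra[!assigned[extra]]
    assigned[core] <- TRUE
    # Interfering peaks may seed later components only if rules1 is TRUE.
    if (!rules1) assigned[extra] <- TRUE
    out[[length(out) + 1]] <- list(peaks = sort(core), interfering = sort(extra))
  }
  out
}

res_default <- build_components(peaks, iso_pairs, adduct_pairs, rules1 = FALSE)
res_rules1  <- build_components(peaks, iso_pairs, adduct_pairs, rules1 = TRUE)
```

With this toy input, the default call yields three components and flags peak 4 as interfering in the first one. With rules1 = TRUE, peak 4 additionally seeds a component of its own, giving four components.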

Four plausibility checks are implemented, represented by warning indices 1 to 4. The first check tests whether the adduct relations found for the peaks grouped under steps one and two above are consistent. If ambiguous adduct relations (e.g. M+H<->M+K and M+Na<->M+NH4) are found for at least one peak, warning 1 is attached to the component concerned. The second check tests whether variations in peak intensities within isotope pattern groups are consistent among the different adducts of the same component. This check takes the uncertainty in peak intensities into account via argument inttol of pattern.search. The third check examines whether interfering peaks occur. The fourth check takes effect if a component consists of merged isotope pattern groups (only relevant if several z-levels were employed, see the charges argument of pattern.search). These warning indices can then be used to exclude the affected components via argument dont. For example, dont=c(1,3) excludes components with ambiguous adduct relations and components with interfering peaks from the final component list.
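The effect of dont can be pictured as a simple filter over per-component warning indices. The sketch below is illustrative only: the list warn and the helper apply_dont() are hypothetical and do not reflect how warnings are stored in the returned component table.

```r
# Hypothetical warnings per component (indices 1 to 4; empty = no warning).
warn <- list(integer(0), 1L, c(2L, 3L), 4L)

# TRUE for components to keep, given the warning indices listed in `dont`.
apply_dont <- function(warn, dont) {
  if (identical(dont, FALSE)) return(rep(TRUE, length(warn)))
  !vapply(warn, function(w) any(w %in% dont), logical(1))
}

keep <- apply_dont(warn, dont = c(1, 3))
which(keep)  # components without warning 1 or 3
```

Here components 2 (warning 1) and 3 (warnings 2 and 3) are dropped, while component 4 is kept because warning 4 is not listed in dont.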

Value

List of type comp with 7 entries

comp[[1]]

Components. Data frame listing the individual components, with component IDs, the IDs of the peaks concerned and warnings per row. The last columns list m/z, intensity and RT of the most intense peak in that component.

comp[[2]]

Pattern peak list. Entry 1 of list of type pattern produced by pattern.search, i.e. pattern[[1]].

comp[[3]]

Adduct peak list. Entry 1 of list of type adduct produced by adduct.search, i.e. adduct[[1]].

comp[[4]]

Homologue list. Entry 1 of list of type homol produced by homol.search, i.e. homol[[1]].

comp[[5]]

Peaks in components. Vector of TRUE or FALSE, indicating whether a peak in pattern[[1]] or adduct[[1]] is part of a component.

comp[[6]]

Summary.

comp[[7]]

Parameters.

Rules setting

rules[1]: Set to TRUE to allow peaks identified as interfering in a component to enter step one of the algorithm (see Details).

rules[2]: Set to TRUE to remove single-peaked components.

rules[3]: Set to TRUE to list only components that are part of one or more homologue series.
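As a rough illustration of rules[2] and rules[3], the following base R sketch filters a toy component list. The list layout (fields peaks and homol) and the helper filter_rules() are assumptions made for this example, not the structure returned by combine.

```r
# Toy components: peak IDs plus IDs of homologue series they belong to.
comps <- list(
  list(peaks = c(1, 2, 3), homol = 2L),
  list(peaks = 5,          homol = integer(0)),
  list(peaks = c(4, 7),    homol = integer(0))
)

# Apply rules[2] (drop single-peaked components) and rules[3] (keep only
# components attached to at least one homologue series).
filter_rules <- function(comps, rules) {
  keep <- rep(TRUE, length(comps))
  if (rules[2]) {
    keep <- keep & vapply(comps, function(x) length(x$peaks) > 1, logical(1))
  }
  if (rules[3]) {
    keep <- keep & vapply(comps, function(x) length(x$homol) > 0, logical(1))
  }
  comps[keep]
}

multi_peak <- filter_rules(comps, rules = c(FALSE, TRUE, FALSE))
in_series  <- filter_rules(comps, rules = c(FALSE, FALSE, TRUE))
```

rules[1] is not covered here because it changes how components are built rather than which finished components are listed.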

Warning

Do not combine adduct pattern groups and/or isotope pattern groups and/or homologue series information from (a) different peak lists or (b) the same peak list differently ordered.
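A quick guard against mixing peak lists might look like the sketch below. It assumes that the first three columns of the peak lists being compared hold m/z, intensity and RT in the original row order; that column layout and the helper same_peaklist() are assumptions of this example, not documented behaviour.

```r
# TRUE only if both tables hold the same peaks in the same row order.
# Row names are ignored, so a reordered copy is still detected by value.
same_peaklist <- function(a, b) {
  ma <- unname(as.matrix(a[, 1:3]))
  mb <- unname(as.matrix(b[, 1:3]))
  isTRUE(all.equal(ma, mb))
}

# Toy peak list and a reordered copy of it.
pl <- data.frame(mz = c(100.1, 200.2, 300.3),
                 intensity = c(5, 9, 1),
                 rt = c(10, 20, 30))
pl_sorted <- pl[order(pl$intensity), ]

same_peaklist(pl, pl)         # same list, same order
same_peaklist(pl, pl_sorted)  # same peaks, different order
```

Under the stated assumption, a call such as same_peaklist(pattern[[1]], adduct[[1]]) before combining would flag inputs that stem from different or differently ordered peak lists.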

Note

Component IDs are allocated in decreasing intensity order of the most intense peak per component; see section Value, comp[[1]]. In contrast, IDs of individual peaks refer to the order in which the peaks are provided.
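The difference between the two ID schemes can be shown on toy data. In this sketch the list membership and all variable names are hypothetical; only the ordering principle is taken from the note above.

```r
# Peak IDs follow the input row order.
peaks <- data.frame(id = 1:5, intensity = c(200, 950, 100, 400, 30))

# Hypothetical components, given as vectors of peak IDs in arbitrary order.
membership <- list(c(1, 3), 2, c(4, 5))

# Intensity of the most intense peak in each component.
top <- vapply(membership, function(p) max(peaks$intensity[p]), numeric(1))

# Component IDs are assigned along decreasing top intensity.
ord <- order(top, decreasing = TRUE)
component_table <- data.frame(
  component_id  = seq_along(ord),
  peak_ids      = vapply(membership[ord], paste, character(1), collapse = ","),
  max_intensity = top[ord]
)
component_table
```

The component containing peak 2 receives ID 1 because it holds the most intense peak, even though peak 2 is not the first row of the peak list.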

Setting the argument pattern to FALSE skips the first step of the algorithm; adduct groups are then searched for single peaks only, along decreasing peak intensities. Setting the argument adduct to FALSE skips the second step of the algorithm; no adduct groups are then searched for.

Author(s)

Martin Loos

See Also

pattern.search, pattern.search2, adduct.search, homol.search, plotisotopes, plotcomp, ms.filter

Examples



######################################################
# (0) Group for isotopologues, adducts & homologues  # 
library(nontarget); # assumed: the package providing these functions and data
data(peaklist);
data(adducts);
data(isotopes);
iso<-make.isos(isotopes,
	use_isotopes=c("13C","15N","34S","37Cl","81Br","41K","13C","15N","34S","37Cl","81Br","41K"),
	use_charges=c(1,1,1,1,1,1,2,2,2,2,2,2))
pattern<-pattern.search(
  peaklist,
  iso,
  cutint=10000,
  rttol=c(-0.05,0.05),
  mztol=2,
  mzfrac=0.1,
  ppm=TRUE,
  inttol=0.2,
  rules=c(TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE),
  deter=FALSE,
  entry=50
);
adduct<-adduct.search(
  peaklist,
  adducts,
  rttol=0.05,
  mztol=3,
  ppm=TRUE,
  use_adducts=c("M+K","M+H","M+Na","M+NH4"),
  ion_mode="positive"
);
homol<-homol.search(
	peaklist,
	isotopes,	
	elements=c("C","H","O"),
	use_C=TRUE,
	minmz=5,
	maxmz=120,
	minrt=2,
	maxrt=2,
	ppm=TRUE,
	mztol=3.5,
	rttol=0.5,
	minlength=5,
	mzfilter=FALSE,
	vec_size=3E6,
	spar=.45,
	R2=.98,
	plotit=FALSE
)
##############################################################
# Combine these individual groups to components              #
##############################################################
# (1) Standard setting:                                      #
# Produce a component list, allowing for single-peaked       #
# components and with interfering peaks also listed as indi- #
# vidual components (with inputs pattern,adduct,homol):      #
comp<-combine(
	pattern,
	adduct,
	homol,
	dont=FALSE,
	rules=c(TRUE,FALSE,FALSE)
);
comp[[6]];
##############################################################
# (2) Produce a list with those components related to a homo-#
# logue series only (requires inputs pattern,adduct,homol):  #
comp<-combine(
	pattern,
	adduct,
	homol,
	dont=FALSE,
	rules=c(TRUE,FALSE,TRUE)
);
comp[[6]];
##############################################################
# (3) Extract only components that are plausible and contain #
# more than one peak per component, without homologue series #
# information attached (with inputs pattern and adduct):     #
comp<-combine(
	pattern,
	adduct,
	homol=FALSE,
	dont=c(1,2,3),
	rules=c(TRUE,TRUE,FALSE)
);
comp[[6]];
##############################################################

