Last data update: 2014.03.03

R: Splitting and merging of data across the time axis.
windows    R Documentation

Splitting and merging of data across the time axis.

Description

Multivariate curve resolution (MCR) data sets can often be analysed much more quickly and efficiently when they are split into several smaller time windows. For interpretation, the results of the analysis can then be merged again.

Usage

splitTimeWindow(datalist, splitpoints, overlap = 0)
mergeTimeWindows(obj, simSThreshold = .9, simCThreshold = .9, verbose = FALSE)

Arguments

datalist

A list of numeric data matrices.

splitpoints

A numerical vector of cut points. In case the time axis extends beyond the range of the cut points, additional cut points are added at the beginning or at the end of the time axis to ensure that all time points are taken into account.

overlap

Number of points in the overlap region between two consecutive windows. Default: 0 (non-overlapping windows).

obj

Either experimental data that have been split into different time windows (a list of lists of matrices), or a list of ALS objects. See the Value section.

simSThreshold, simCThreshold

Similarity thresholds, expressed as correlations, used to decide whether two patterns are the same. simSThreshold applies to the spectral components and simCThreshold to the chromatographic components (concentration profiles). If the time windows do not overlap, simCThreshold is not used.

verbose

logical: print additional information?

Details

When splitting data files, the non-overlapping areas should be at least as big as the overlap areas. If not, the function stops with an error message. Note that the example below is only meant to show the use of the function: the data do not have enough time resolution to allow for a big overlap.

Value

Function splitTimeWindow splits every matrix in a list of data matrices into submatrices corresponding to time windows. The result is a list of lists, where each top-level element is one time window. Such a time window can then be presented to the ALS algorithm.
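
The list-of-lists layout can be illustrated without the package. The following is a minimal base-R sketch, not the alsace implementation: the helper name splitByTime and the toy matrices are hypothetical, and the sketch assumes that a time point equal to a cut point belongs to the earlier window, which is what the row names in the tea example suggest (the second window starts at 12.05).

```r
## Hypothetical helper (not part of alsace): split every matrix in a list
## into time windows, using the row names as the time axis.
splitByTime <- function(datalist, splitpoints) {
  times <- as.numeric(rownames(datalist[[1]]))
  ## Open-ended outer breaks, so that all time points are covered
  breaks <- c(-Inf, sort(splitpoints), Inf)
  ## cut() uses right-closed intervals by default: (lower, upper]
  window <- cut(times, breaks = breaks,
                labels = paste("Window", seq_len(length(breaks) - 1)))
  ## One top-level element per window, each holding one submatrix per file
  lapply(split(seq_along(times), window), function(idx)
    lapply(datalist, function(x) x[idx, , drop = FALSE]))
}

## Toy data: two "files" with 97 time points (10.2 to 15.0) and 3 columns
times <- (204:300) / 20
toy <- list(a = matrix(1:291, nrow = 97,
                       dimnames = list(as.character(times), NULL)),
            b = matrix(291:1, nrow = 97,
                       dimnames = list(as.character(times), NULL)))

toy.split <- splitByTime(toy, c(12, 14))
names(toy.split)                          # "Window 1" "Window 2" "Window 3"
sapply(toy.split, function(w) nrow(w$a))  # 37, 40 and 20 rows
```

With the same time axis and cut points as the tea example, the sketch gives the same row counts per window (37, 40 and 20). It does not handle overlapping windows.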

Function mergeTimeWindows can be used to merge data matrices as well as ALS result objects. In the first case, for each series of data matrices corresponding to different time windows, one big concatenated matrix will be returned. In the second case, exactly the same will be done for the residual matrices and concentration profiles in the ALS object. Spectral components are assumed to be different in different time windows, unless they have a correlation higher than simSThreshold, in which case they are merged. If overlapping time windows are used, an additional requirement is that the similarity between the concentration profiles in the overlap area must be at least simCThreshold. This similarity again is measured as a correlation.
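
The role of simSThreshold can be sketched with toy spectra. This is an illustration of correlation-based matching only, under stated assumptions: the Gaussian "spectra", the helper peak and the variable names are made up, and the actual matching in mergeTimeWindows may differ (for instance in how it resolves multiple matches or applies simCThreshold in overlap regions).

```r
## Toy spectra for two time windows: rows are wavelengths, columns components
wl <- seq(200, 400, length.out = 50)
peak <- function(centre, width = 15) exp(-((wl - centre) / width)^2)
S1 <- cbind(peak(250), peak(330))   # window 1: components A and B
S2 <- cbind(peak(251), peak(290))   # window 2: A (slightly shifted) and C

simSThreshold <- 0.9
## Correlation between every component in window 1 (rows)
## and every component in window 2 (columns)
simS <- cor(S1, S2)

## Pairs that are similar enough to be treated as the same component
matches <- which(simS > simSThreshold, arr.ind = TRUE)
matches

## Number of distinct components after merging the matched pairs
ncol(S1) + ncol(S2) - nrow(matches)
```

Here only the first component of each window exceeds the threshold, so the four window-specific components reduce to three after merging.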

Author(s)

Ron Wehrens

Examples

## splitting and merging of data files
data(tea)
tea.split <- splitTimeWindow(tea.raw, c(12, 14))
names(tea.split)
sapply(tea.split, length)
lapply(tea.split, function(x) sapply(x, dim))
rownames(tea.split[[1]][[1]])[1:10]
rownames(tea.split[[2]][[1]])[1:10]

tea.merge <- mergeTimeWindows(tea.split)
all.equal(tea.merge, tea.raw)                    ## should be TRUE

tea.split2 <- splitTimeWindow(tea.raw, c(12, 14), overlap = 10)
lapply(tea.split2, function(x) sapply(x, dim))
tea.merge2 <- mergeTimeWindows(tea.split2)
all.equal(tea.merge2, tea.raw)                   ## should be TRUE

## merging of ALS results
data(teaMerged) 
ncomp <- ncol(teaMerged$S)
myPalette <- colorRampPalette(c("black", "red", "blue", "green"))
mycols <- myPalette(ncomp)

## show spectra - plotting only a few of them is much more clear...
plot(teaMerged, what = "spectra", col = mycols, comp.idx = c(2, 6))
legend("top", col = mycols[c(2, 6)], lty = 1, bty = "n",
       legend = paste("C", c(2, 6)))

## show concentration profiles - all six files
plot(teaMerged, what = "profiles", col = mycols)
## only the second file
plot(teaMerged, what = "profiles", mat.idx = 2, col = mycols)
legend("topleft", col = mycols, lty = 1, bty = "n",
       legend = paste("C", 1:ncol(teaMerged$S)))
## Note that components 2 and 6 are continuous across the window borders
## - these are found in all three windows

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(alsace)
Loading required package: ALS
Loading required package: nnls
Loading required package: Iso
Iso 0.0-17
Loading required package: ptw
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/alsace/windows.Rd_%03d_medium.png", width=480, height=480)
> ### Name: windows
> ### Title: Splitting and merging of data across the time axis.
> ### Aliases: windows splitTimeWindow mergeTimeWindows
> ### Keywords: manip
> 
> ### ** Examples
> 
> ## splitting and merging of data files
> data(tea)
> tea.split <- splitTimeWindow(tea.raw, c(12, 14))
> names(tea.split)
[1] "Window 1" "Window 2" "Window 3"
> sapply(tea.split, length)
Window 1 Window 2 Window 3 
       5        5        5 
> lapply(tea.split, function(x) sapply(x, dim))
$`Window 1`
     tday0a tday0b tday01 tday03 tday04
[1,]     37     37     37     37     37
[2,]    209    209    209    209    209

$`Window 2`
     tday0a tday0b tday01 tday03 tday04
[1,]     40     40     40     40     40
[2,]    209    209    209    209    209

$`Window 3`
     tday0a tday0b tday01 tday03 tday04
[1,]     20     20     20     20     20
[2,]    209    209    209    209    209

> rownames(tea.split[[1]][[1]])[1:10]
 [1] "10.2"  "10.25" "10.3"  "10.35" "10.4"  "10.45" "10.5"  "10.55" "10.6" 
[10] "10.65"
> rownames(tea.split[[2]][[1]])[1:10]
 [1] "12.05" "12.1"  "12.15" "12.2"  "12.25" "12.3"  "12.35" "12.4"  "12.45"
[10] "12.5" 
> 
> tea.merge <- mergeTimeWindows(tea.split)
> all.equal(tea.merge, tea.raw)                    ## should be TRUE
[1] TRUE
> 
> tea.split2 <- splitTimeWindow(tea.raw, c(12, 14), overlap = 10)
> lapply(tea.split2, function(x) sapply(x, dim))
$`Window 1`
     tday0a tday0b tday01 tday03 tday04
[1,]     47     47     47     47     47
[2,]    209    209    209    209    209

$`Window 2`
     tday0a tday0b tday01 tday03 tday04
[1,]     60     60     60     60     60
[2,]    209    209    209    209    209

$`Window 3`
     tday0a tday0b tday01 tday03 tday04
[1,]     30     30     30     30     30
[2,]    209    209    209    209    209

> tea.merge2 <- mergeTimeWindows(tea.split2)
> all.equal(tea.merge2, tea.raw)                   ## should be TRUE
[1] TRUE
> 
> ## merging of ALS results
> data(teaMerged) 
> ncomp <- ncol(teaMerged$S)
> myPalette <- colorRampPalette(c("black", "red", "blue", "green"))
> mycols <- myPalette(ncomp)
> 
> ## show spectra - plotting only a few of them is much more clear...
> plot(teaMerged, what = "spectra", col = mycols, comp.idx = c(2, 6))
> legend("top", col = mycols[c(2, 6)], lty = 1, bty = "n",
+        legend = paste("C", c(2, 6)))
> 
> ## show concentration profiles - all six files
> plot(teaMerged, what = "profiles", col = mycols)
> ## only the second file
> plot(teaMerged, what = "profiles", mat.idx = 2, col = mycols)
> legend("topleft", col = mycols, lty = 1, bty = "n",
+        legend = paste("C", 1:ncol(teaMerged$S)))
> ## Note that components 2 and 6 are continuous across the window borders
> ## - these are found in all three windows
> 
> dev.off()
null device 
          1 
>