Last data update: 2014.03.03

R: Implement an Alternative Gap-fill Algorithm
Extend (R Documentation)

Description

By default, the Gapfill function uses the Subset and Predict functions to predict missing values. To implement an alternative gap-fill procedure, replace these functions with user-defined ones and pass them to Gapfill via the arguments fnSubset and fnPredict.
The example section below gives two such extensions:

Example 1:

Illustration of the concept. The prediction is the mean of the subset around a missing value.

Example 2:

An algorithm using the Score and the lm functions.

Details

To work properly, the user-defined Subset function needs to have the arguments:

data:

The input data array.

mp:

Numeric vector of length 4 specifying the index of the currently treated missing value.

i:

Integer vector of length 1. Number of subsets already tried without success for the current missing value.
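As a rough illustration of these three arguments, the following sketch defines a hypothetical subset function, SubsetWindow, which is not part of the package. It assumes that the first two dimensions of the data array are spatial and returns a window around the missing value that widens with i. The extra argument halfWidth and the synthetic array toy are likewise made up for this example.

```r
## Hypothetical user-defined subset function (not part of the package).
## Assumption: dimensions 1 and 2 of 'data' are spatial.
## Returns all values in a spatial window around the missing value 'mp';
## the window widens by one cell for every unsuccessful attempt 'i'.
SubsetWindow <- function(data, mp, i, halfWidth = 1L) {
    stopifnot(length(dim(data)) == 4L, length(mp) == 4L)
    w <- halfWidth + i
    xr <- max(1L, mp[1] - w):min(dim(data)[1], mp[1] + w)
    yr <- max(1L, mp[2] - w):min(dim(data)[2], mp[2] + w)
    data[xr, yr, , , drop = FALSE]
}

## Small synthetic 4-dimensional array standing in for real data
set.seed(1)
toy <- array(rnorm(5 * 5 * 3 * 2), dim = c(5, 5, 3, 2))
toy[3, 3, 2, 1] <- NA
mp <- c(3, 3, 2, 1)

dim(SubsetWindow(toy, mp, i = 0))   # 3 3 3 2
dim(SubsetWindow(toy, mp, i = 1))   # 5 5 3 2
```

Whatever such a function returns is handed to the predict function as its argument a, so the two functions have to agree on that format.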

The user-defined Predict function needs to have the arguments:

a:

Return value of the Subset function.

i:

Integer vector of length 1. Number of subsets already tried without success for the current missing value.

Both functions may take additional arguments. The default values of these arguments can be changed via the ... arguments of Gapfill.
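The interplay of the two functions, the counter i, and an additional argument can be sketched without the package. Everything below is an assumption made for illustration: SubsetGrow, PredictMedian, the extra argument minObs, and the driver FillOne with its iMax limit are hypothetical names, and FillOne is only a simplified stand-in for the retry logic, not the code Gapfill actually runs.

```r
## Hypothetical subset function: spatial window that widens with i
SubsetGrow <- function(data, mp, i) {
    xr <- max(1, mp[1] - i):min(dim(data)[1], mp[1] + i)
    yr <- max(1, mp[2] - i):min(dim(data)[2], mp[2] + i)
    data[xr, yr, , , drop = FALSE]
}

## Hypothetical predict function with the additional argument 'minObs'.
## Returning NA signals that this subset was not successful.
PredictMedian <- function(a, i, minObs = 5L) {
    if (sum(!is.na(a)) < minObs)
        return(NA)
    median(a, na.rm = TRUE)
}

## Simplified stand-in for the retry logic (assumption, not Gapfill's code):
## increase i until the predict function returns a non-NA value.
FillOne <- function(data, mp, fnSubset, fnPredict, iMax = 10L, ...) {
    for (i in 0:iMax) {
        a <- fnSubset(data = data, mp = mp, i = i)
        p <- fnPredict(a = a, i = i, ...)
        if (!is.na(p[1]))
            return(c(value = p[1], i = i))
    }
    c(value = NA, i = iMax)
}

## Synthetic data: a 3 x 3 block of missing values surrounded by 2s
toy <- array(2, dim = c(5, 5, 1, 1))
toy[2:4, 2:4, 1, 1] <- NA
mp <- c(3, 3, 1, 1)

res1 <- FillOne(toy, mp, SubsetGrow, PredictMedian)               # succeeds at i = 2
res2 <- FillOne(toy, mp, SubsetGrow, PredictMedian, minObs = 20)  # never succeeds
```

In the second call, minObs = 20 overrides the default of PredictMedian through the ... argument of the driver, in the same way the examples pass nNotNA through Gapfill.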

Author(s)

Florian Gerber, florian.gerber@math.uzh.ch.

References

F. Gerber, R. Furrer, G. Schaepman-Strub, R. de Jong, M. E. Schaepman, 2016, Predicting missing values in spatio-temporal satellite data. http://arxiv.org/abs/1605.01038.

See Also

Gapfill, Subset-Predict, Score, lm.

Examples

## Not run: 
## Example 1: mean ----------------------------------
## define a predict function
PredictMean <- function (a, i) mean(a, na.rm = TRUE)

out1 <- Gapfill(data = ndvi, fnPredict = PredictMean)
Image(out1$fill)

## start with a smaller subset
args(Subset)
out2 <- Gapfill(data = ndvi, fnPredict = PredictMean,
                initialSize = c(0, 0, 1, 6))
Image(out2$fill)

## require at least "nNotNA" non-NA values
## return predicted value and number of iterations i
PredictMean2 <- function (a, i, nNotNA) {
    if (sum(!is.na(a)) < nNotNA)
        return (c(NA, NA))
    c(mean(a, na.rm = TRUE), i)
}
out3 <- Gapfill(data = ndvi, fnPredict = PredictMean2, nPredict = 2,
                initialSize = c(0, 0, 1, 6), nNotNA = 0)
stopifnot(identical(c(out2$fill), c(out3$fill[,,,,1])))
Image(out3$fill[,,,,2])  # number of used iterations i

out4 <- Gapfill(data = ndvi, fnPredict = PredictMean2, nPredict = 2,
                initialSize = c(0, 0, 1, 6), nNotNA = 50)
Image(out4$fill[,,,,1])  # fill values
Image(out4$fill[,,,,2])  # number of used iterations i


## Example 2: Score() and lm() ----------------------
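PredictLm <- function (a, i, nNotNA = 50, minScores = 2){
    ## give up on this subset if it holds too few observed values
    if (sum(!is.na(a)) < nNotNA)
        return (NA)
    ## flatten the 4-dimensional subset to a matrix
    am <- Array2Matrix(a)
    ## one score per row of am
    sx <- Score(t(am))
    lsx <- length(sx)
    if (lsx < minScores)
        return (NA)
    ## one score per column of am
    sy <- Score(am)
    lsy <- length(sy)
    if (lsy < minScores)
        return (NA)
    ## long format: one line per matrix entry with its row and column score
    df <- data.frame(z = c(am),
                     sx = rep(sx, ncol(am)),
                     sy = rep(sy, each = nrow(am)))
    ## line of df belonging to the missing value to predict
    newdata <- df[IndexTwoOne(attr(am, "mp"), dim(am)),]
    ## regress the values on the scores and predict the missing value
    m <- lm(z ~ sx * sy, data = df)
    predict(m, newdata = newdata)
}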

## test PredictLm() by running it
## manually for one missing value
mp <- IndexOneFour(which(is.na(ndvi))[1], dim(ndvi))
a <- Subset(data = ndvi, mp = mp, i = 0)
PredictLm(a = a, i = 0)

## run PredictLm() on ndvi data
out5 <- Gapfill(data = ndvi, fnPredict = PredictLm,
                nNotNA = 50)
Image(out5$fill)

## End(Not run)
