Last data update: 2014.03.03

R: Robust fit of a Beta distribution using CvM distance...
betaCvMfitR Documentation

Robust fit of a Beta distribution using CvM distance minimization

Description

Robustly fits a Beta distribution to data using Cramér-von-Mises (CvM) distance minimization.

Usage

betaCvMfit(data, CvM = TRUE, rob = TRUE)

Arguments

data

numeric vector: The sample, a Beta distribution is fitted to.

CvM

logical: If FALSE the Cramér-von-Mises-distance is not minimized, but only moment estimates for the parameters of the Beta distribution are returned (see Details).

rob

logical: If TRUE, mean and standard deviation are replaced by median and MAD when calculating moment estimates for the parameters of the Beta distribution (see Details).

Details

betaCvMfit fits a Beta distribution to data by minimizing the Cramér-von-Mises distance. Moment estimates of the parameters of the Beta distribution, clipped to positive values, are used as starting values for the optimization process. They are calculated using

a_est = - mean(x)*(-mean(x)+mean(x)^2+var(x))/var(x)

b_est=(a_est-a_est*mean(x))/mean(x)

These clipped moment estimates can be returned instead of CvM-fitted parameters setting CvM = FALSE.

The Cramér-von-Mises distance is defined as (see Clarke, McKinnon and Riley 2012)

1/n*∑(F(u[(i)])-(i-0.5)/n)^2+1/(12n^2)

where u[(1)], …, u[(n)] is the ordered sample and F the distribution function of Beta(a,b).

Value

numeric vector: Estimates for the Parameters a,b of a Beta(a,b) distribution with mean a/(a+b).

Note

Adapted from R-Code from Brenton R. Clarke to fit a Gamma distribution (see Clarke, McKinnon and Riley 2012) using Cramér-von-Mises distance minimization. Used in Thieler et al. (2013). See also Thieler, Fried and Rathjens (2016).

Author(s)

Anita M. Thieler, with contributions from Brenton R. Clarke.

References

Clarke, B. R., McKinnon, P. L. and Riley, G. (2012): A Fast Robust Method for Fitting Gamma Distributions. Statistical Papers, 53 (4), 1001-1014

Thieler, A. M., Backes, M., Fried, R. and Rhode, W. (2013): Periodicity Detection in Irregularly Sampled Light Curves by Robust Regression and Outlier Detection. Statistical Analysis and Data Mining, 6 (1), 73-89

Thieler, A. M., Fried, R. and Rathjens, J. (2016): RobPer: An R Package to Calculate Periodograms for Light Curves Based on Robust Regression. Journal of Statistical Software, 69 (9), 1-36, <doi:10.18637/jss.v069.i09>

See Also

See RobPer-package for an example applying betaCvMfit to detect valid periods in a periodogram.

Examples

# data:
set.seed(12)
PP <- c(rbeta(45, shape1=4, shape2=15), runif(5, min=0.8, max=1))
hist(PP, freq=FALSE, breaks=30, ylim=c(0,7), xlab="Periodogram bar")

# true parameters:
myf.true <- function(x) dbeta(x, shape1=4, shape2=15)
curve(myf.true, add=TRUE, lwd=2)

# method of moments:
par.mom <- betaCvMfit(PP, rob=FALSE, CvM=FALSE)
myf.mom <- function(x) dbeta(x, shape1=par.mom[1], shape2=par.mom[2])
curve(myf.mom, add=TRUE, lwd=2, col="red")

# robust method of moments
par.rob <- betaCvMfit(PP, rob=TRUE, CvM=FALSE)
myf.rob <- function(x) dbeta(x, shape1=par.rob[1], shape2=par.rob[2])
curve(myf.rob, add=TRUE, lwd=2, col="blue")

# CvM distance minimization
par.CvM <- betaCvMfit(PP, rob=TRUE, CvM=TRUE)
myf.CvM <- function(x) dbeta(x, shape1=par.CvM[1], shape2=par.CvM[2])
curve(myf.CvM, add=TRUE, lwd=2, col="green")

# Searching for outliers...
abline(v=qbeta((0.95)^(1/50), shape1=par.CvM[1], shape2=par.CvM[2]), col="green")

legend("topright", fill=c("black", "green","blue", "red"),
    legend=c("true", "CvM", "robust moments", "moments"))
box()

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(RobPer)
Loading required package: robustbase
Loading required package: quantreg
Loading required package: SparseM

Attaching package: 'SparseM'

The following object is masked from 'package:base':

    backsolve

Loading required package: splines
Loading required package: BB
Loading required package: rgenoud
##  rgenoud (Version 5.7-12.4, Build Date: 2015-07-19)
##  See http://sekhon.berkeley.edu/rgenoud for additional documentation.
##  Please cite software as:
##   Walter Mebane, Jr. and Jasjeet S. Sekhon. 2011.
##   ``Genetic Optimization Using Derivatives: The rgenoud package for R.''
##   Journal of Statistical Software, 42(11): 1-26. 
##

> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/RobPer/betaCvMfit.Rd_%03d_medium.png", width=480, height=480)
> ### Name: betaCvMfit
> ### Title: Robust fit of a Beta distribution using CvM distance
> ###   minimization
> ### Aliases: betaCvMfit
> 
> ### ** Examples
> 
> # data:
> set.seed(12)
> PP <- c(rbeta(45, shape1=4, shape2=15), runif(5, min=0.8, max=1))
> hist(PP, freq=FALSE, breaks=30, ylim=c(0,7), xlab="Periodogram bar")
> 
> # true parameters:
> myf.true <- function(x) dbeta(x, shape1=4, shape2=15)
> curve(myf.true, add=TRUE, lwd=2)
> 
> # method of moments:
> par.mom <- betaCvMfit(PP, rob=FALSE, CvM=FALSE)
> myf.mom <- function(x) dbeta(x, shape1=par.mom[1], shape2=par.mom[2])
> curve(myf.mom, add=TRUE, lwd=2, col="red")
> 
> # robust method of moments
> par.rob <- betaCvMfit(PP, rob=TRUE, CvM=FALSE)
> myf.rob <- function(x) dbeta(x, shape1=par.rob[1], shape2=par.rob[2])
> curve(myf.rob, add=TRUE, lwd=2, col="blue")
> 
> # CvM distance minimization
> par.CvM <- betaCvMfit(PP, rob=TRUE, CvM=TRUE)
> myf.CvM <- function(x) dbeta(x, shape1=par.CvM[1], shape2=par.CvM[2])
> curve(myf.CvM, add=TRUE, lwd=2, col="green")
> 
> # Searching for outliers...
> abline(v=qbeta((0.95)^(1/50), shape1=par.CvM[1], shape2=par.CvM[2]), col="green")
> 
> legend("topright", fill=c("black", "green","blue", "red"),
+     legend=c("true", "CvM", "robust moments", "moments"))
> box()
> 
> 
> 
> 
> 
> dev.off()
null device 
          1 
>