Last data update: 2014.03.03

multiecdf    R Documentation

Multiple empirical cumulative distribution functions (ecdf) and densities

Description

Plot multiple empirical cumulative distribution functions (ecdf) and densities with a user interface similar to that of boxplot. The usefulness of multidensity is variable, depending on the data and the smoothing kernel. multiecdf will in many cases be preferable. Please see Details.

Usage

multiecdf(x, ...)
## S3 method for class 'formula'
multiecdf(formula, data = NULL, xlab, na.action = NULL, ...)
## S3 method for class 'matrix'
multiecdf(x, xlab, ...) 
## S3 method for class 'list'
multiecdf(x,
          xlim,
          col = brewer.pal(9, "Set1"),
          main = "ecdf",
          xlab,
          do.points = FALSE,
          subsample = 1000L,
          legend = list(
            x = "right",
            legend = if(is.null(names(x))) paste(seq(along=x)) else names(x),
            fill = col),
          ...)

multidensity(x, ...)
## S3 method for class 'formula'
multidensity(formula, data = NULL, xlab, na.action = NULL, ...)
## S3 method for class 'matrix'
multidensity(x, xlab, ...) 
## S3 method for class 'list'
multidensity(x,
             bw = "nrd0",
             xlim,
             ylim,
             col  = brewer.pal(9, "Set1"),
             main = if(length(x)==1) "density" else "densities",
             xlab,
             lty  = 1L,
             legend = list(
               x = "topright",
               legend = if(is.null(names(x))) paste(seq(along=x)) else names(x),
               fill = col),
             density = NULL,
             ...)

Arguments

formula

a formula, such as y ~ grp, where y is a numeric vector of data values to be split into groups according to the grouping variable grp (usually a factor).

data

a data.frame (or list) from which the variables in formula should be taken.

na.action

a function which indicates what should happen when the data contain NAs. The default is to ignore missing values in either the response or the group.

x

methods exist for: formula, matrix, data.frame, list of numeric vectors.

bw

the smoothing bandwidth, see the manual page for density. The length of bw needs to be either 1 (in which case the same is used for all groups) or the same as the number of groups in x (in which case the corresponding value of bw is used for each group).

xlim

Range of the x axis. If missing, the data range is used.

ylim

Range of the y axis. If missing, the range of the density estimates is used.

col, lty

Line colors and line type.

main

Plot title.

xlab

x-axis label.

do.points

logical; if TRUE, also draw points at the knot locations.

subsample

numeric or logical of length 1. If numeric, and larger than 0, subsamples of that size are used to compute and plot the ecdf for those elements of x with more than that number of observations. If logical and TRUE, a value of 1000 is used for the subsample size.

legend

a list of arguments that is passed to the function legend.

density

a list of arguments that is passed to the function density.

...

Further arguments that get passed to the plot functions.
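To make the list input and the subsample argument more concrete, here is a minimal sketch. The subsampling step written out with lapply() is only an illustration of the behaviour described above (groups larger than the threshold are reduced to that many observations); it is an assumption about the idea, not a copy of the package's internal code. The object names x, x_sub and ecdfs are made up for this example, and the geneplotter call is guarded so the snippet also runs where the package is not installed.

```r
set.seed(3)

# A named list of numeric vectors; the names would serve as legend labels.
x <- list(small = rnorm(500), large = rnorm(50000))

# Illustrative subsampling: only groups with more than `subsample`
# observations are reduced.
subsample <- 1000L
x_sub <- lapply(x, function(v) {
  if (length(v) > subsample) sample(v, subsample) else v
})

# One empirical cumulative distribution function per group (base R).
ecdfs <- lapply(x_sub, ecdf)

# The packaged function, using the arguments documented above.
if (requireNamespace("geneplotter", quietly = TRUE)) {
  geneplotter::multiecdf(x, subsample = 1000L, do.points = FALSE)
}
```

Subsampling keeps plotting fast for very large groups; the ecdf of a random subsample of 1000 points is usually visually close to that of the full data.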

Details

Density estimates: multidensity uses the function density. If the density of the data-generating process is smooth on the real axis, the output of this function tends to be a good approximation of the true density. If, however, the true density has steps (this is in particular the case for quantities such as p-values and correlation coefficients, or for distributions that have weight only on the positive numbers, or only on the integers), the output tends to be misleading. In that case, please either use multiecdf or histograms, or try to improve the density estimate by passing suitable values (from, to, kernel) in the density argument.
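The boundary problem can be seen with base R alone. The sketch below uses simulated uniform "p-values" (an assumption made purely for illustration; the names p_null, p_alt, d_default, d_bounded and Fn are not part of the package). It compares the default kernel density estimate, whose grid extends beyond [0, 1], with one restricted through from and to, and with the ecdf, which needs no smoothing at all.

```r
set.seed(4)

# Simulated p-values: the true density has steps at 0 and 1.
p_null <- runif(1000)
p_alt  <- rbeta(1000, 0.5, 1)

# Default estimate: the evaluation grid reaches outside [0, 1].
d_default <- density(p_null)
range(d_default$x)

# Restricting the grid with from/to. Note that this only limits where the
# estimate is evaluated; it does not remove smoothing across the boundary.
d_bounded <- density(p_null, from = 0, to = 1)
range(d_bounded$x)

# The ecdf is a step function of the data and respects the support.
Fn <- ecdf(p_null)
Fn(c(-0.1, 0.5, 1))

# With geneplotter, the same options go into the `density` argument.
if (requireNamespace("geneplotter", quietly = TRUE)) {
  geneplotter::multidensity(list(null = p_null, alt = p_alt),
                            density = list(from = 0, to = 1))
  geneplotter::multiecdf(list(null = p_null, alt = p_alt))
}
```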

Bandwidths: the choice of the smoothing bandwidths in multidensity can be problematic, in particular if the different groups vary with respect to range and/or number of data points. If curves look excessively wiggly or overly smooth, try varying the arguments xlim and bw; note that the argument bw can be a vector, in which case it is expected to align with the groups.
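As a hedged illustration of why per-group bandwidths can matter, the following sketch computes the default ("nrd0") bandwidth for two simulated groups that differ in spread and sample size, and then supplies one bandwidth per group. The groups and the variable names (groups, default_bw, bw_each, dens) are invented for this example; bw.nrd0 and density are the standard functions from the stats package.

```r
set.seed(5)

# Two groups that differ strongly in range and number of observations.
groups <- list(small_wide   = rnorm(30,   sd = 5),
               large_narrow = rnorm(5000, sd = 0.5))

# Default rule-of-thumb bandwidth for each group.
default_bw <- sapply(groups, bw.nrd0)
default_bw

# A vector of bandwidths, one per group and in the same order as `groups`.
bw_each <- unname(default_bw)
stopifnot(length(bw_each) == length(groups))

# Base R equivalent: one density object per group, each with its own bw.
dens <- Map(function(v, b) density(v, bw = b), groups, bw_each)

# With geneplotter, the vector is passed through the `bw` argument.
if (requireNamespace("geneplotter", quietly = TRUE)) {
  geneplotter::multidensity(groups, bw = bw_each)
}
```

A single numeric bw shared by both groups would either oversmooth the narrow group or undersmooth the wide one, which is the situation the paragraph above warns about.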

Value

For the multidensity functions, a list of density objects.
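A list of density objects can be processed with the usual list tools. The sketch below builds a stand-in list with lapply(groups, density), assuming the value returned by multidensity has the shape documented above, and extracts the location of each curve's maximum. The names groups, dens and modes are hypothetical.

```r
set.seed(6)

groups <- list(a = rnorm(300, mean = 0), b = rnorm(300, mean = 4))

# Stand-in for the documented return value: one "density" object per group.
# With geneplotter attached, one would instead capture the result of
# multidensity(groups).
dens <- lapply(groups, density)

# Each density object carries the evaluation grid (x) and the estimate (y).
modes <- sapply(dens, function(d) d$x[which.max(d$y)])
modes
```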

Author(s)

Wolfgang Huber

See Also

boxplot, ecdf, density

Examples

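  library(geneplotter)

  ## Use the words of the package description as group labels
  words = strsplit(packageDescription("geneplotter")$Description, " ")[[1]]
  factr = factor(sample(words, 2000, replace = TRUE))
  ## Draw values whose mean depends on the group (the integer factor code)
  x = rnorm(length(factr), mean=as.integer(factr))

  ## One ecdf or density curve per group
  multiecdf(x ~ factr)
  multidensity(x ~ factr)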
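A further sketch, using the data argument of the formula method. The data frame df and its columns value and group are made up for this illustration. The split() call is only a conceptual picture of how a formula such as value ~ group partitions the response by the grouping variable; it is not a claim about the internal implementation.

```r
set.seed(1)

# Hypothetical data: a numeric response and a two-level grouping factor.
df <- data.frame(
  value = c(rnorm(100, mean = 0), rnorm(100, mean = 2)),
  group = factor(rep(c("a", "b"), each = 100))
)

# Conceptually, the formula splits the response into one vector per group.
groups <- split(df$value, df$group)
str(groups)

# Formula interface with variables looked up in `data`.
if (requireNamespace("geneplotter", quietly = TRUE)) {
  geneplotter::multiecdf(value ~ group, data = df)
  geneplotter::multidensity(value ~ group, data = df)
}
```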

Results


> library(geneplotter)
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/geneplotter/multiecdf.Rd_%03d_medium.png", width=480, height=480)
> ### Name: multiecdf
> ### Title: Multiple empirical cumulative distribution functions (ecdf) and
> ###   densities
> ### Aliases: multiecdf multiecdf.list multiecdf.formula multiecdf.matrix
> ###   multidensity multidensity.list multidensity.formula
> ###   multidensity.matrix
> ### Keywords: hplot
> 
> ### ** Examples
> 
>   words = strsplit(packageDescription("geneplotter")$Description, " ")[[1]]
>   factr = factor(sample(words, 2000, replace = TRUE))
>   x = rnorm(length(factr), mean=as.integer(factr))
>   
>   multiecdf(x ~ factr)
>   multidensity(x ~ factr)
> 
> dev.off()
null device 
          1 
>