Last data update: 2014.03.03

multiecdf

R Documentation

Multiple empirical cumulative distribution functions (ecdf) and densities

Description

Plot multiple empirical cumulative distribution functions (ecdf) and densities with a user interface similar to that of boxplot. How useful multidensity is depends on the data and the smoothing kernel; in many cases multiecdf is preferable. Please see Details.

Usage

multiecdf(x, ...)
## S3 method for class 'formula'
multiecdf(formula, data = NULL, xlab, na.action = NULL, ...)
## S3 method for class 'matrix'
multiecdf(x, xlab, ...) 
## S3 method for class 'list'
multiecdf(x,
          xlim,
          col = brewer.pal(9, "Set1"),
          main = "ecdf",
          xlab,
          do.points = FALSE,
          subsample = 1000L,
          legend = list(
            x = "right",
            legend = if(is.null(names(x))) paste(seq(along=x)) else names(x),
            fill = col),
          ...)

multidensity(x, ...)
## S3 method for class 'formula'
multidensity(formula, data = NULL, xlab, na.action = NULL, ...)
## S3 method for class 'matrix'
multidensity(x, xlab, ...) 
## S3 method for class 'list'
multidensity(x,
             bw = "nrd0",
             xlim,
             ylim,
             col  = brewer.pal(9, "Set1"),
             main = if(length(x)==1) "density" else "densities",
             xlab,
             lty  = 1L,
             legend = list(
               x = "topright",
               legend = if(is.null(names(x))) paste(seq(along=x)) else names(x),
               fill = col),
             density = NULL,
             ...)

Arguments

`formula`	a formula, such as `y ~ grp`, where `y` is a numeric vector of data values to be split into groups according to the grouping variable `grp` (usually a factor).
`data`	a data.frame (or list) from which the variables in `formula` should be taken.
`na.action`	a function which indicates what should happen when the data contain `NA`s. The default is to ignore missing values in either the response or the group.
`x`	the data to plot. Methods exist for `formula`, `matrix`, `data.frame`, and `list` of numeric vectors.
`bw`	the smoothing bandwidth, see the manual page for `density`. The length of `bw` needs to be either 1 (in which case the same is used for all groups) or the same as the number of groups in `x` (in which case the corresponding value of `bw` is used for each group).
`xlim`	Range of the x axis. If missing, the data range is used.
`ylim`	Range of the y axis. If missing, the range of the density estimates is used.
`col, lty`	Line colors and line type.
`main`	Plot title.
`xlab`	x-axis label.
`do.points`	logical; if `TRUE`, also draw points at the knot locations.
`subsample`	numeric or logical of length 1. If numeric, and larger than 0, subsamples of that size are used to compute and plot the ecdf for those elements of `x` with more than that number of observations. If logical and `TRUE`, a value of 1000 is used for the subsample size.
`legend`	a list of arguments that is passed to the function `legend`.
`density`	a list of arguments that is passed to the function `density`.
`...`	Further arguments that get passed to the `plot` functions.
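
The grouping and subsampling described above can be sketched in base R. This is only an illustration of the idea, not geneplotter's implementation: `subsample_groups` is a hypothetical helper, and treating `subsample = FALSE` as "no subsampling" is an assumption.

```r
# Sketch only: `subsample_groups` is a hypothetical helper, not part of geneplotter.
subsample_groups <- function(x, subsample = 1000L) {
  # A logical TRUE stands for a subsample size of 1000 (as documented);
  # mapping FALSE to "no subsampling" is an assumption of this sketch.
  if (is.logical(subsample))
    subsample <- if (subsample) 1000L else 0L
  stopifnot(is.numeric(subsample), length(subsample) == 1L)
  if (subsample <= 0) return(x)
  lapply(x, function(v) {
    # Only elements with more observations than `subsample` are reduced
    if (length(v) > subsample) v[sample.int(length(v), subsample)] else v
  })
}

set.seed(1)
grp <- factor(rep(c("small", "large"), times = c(50, 5000)))
y   <- rnorm(length(grp), mean = as.integer(grp))

# A formula such as y ~ grp corresponds to splitting y by grp into a
# list of numeric vectors, one per level
groups <- split(y, grp)

sub   <- subsample_groups(groups, subsample = 1000L)
sizes <- lengths(sub)        # group sizes after subsampling
ecdfs <- lapply(sub, ecdf)   # one ecdf per group
```

Subsampling keeps plotting cheap for very large groups while leaving small groups untouched.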

Details

Density estimates: multidensity uses the function density. If the density of the data-generating process is smooth on the real axis, then the output of this function tends to be a good approximation of the true density. If, however, the true density has steps (this is in particular the case for quantities such as p-values and correlation coefficients, or for distributions that have weight only on the positive numbers or only on the integers), then the output of this function tends to be misleading. In that case, please either use multiecdf or histograms, or try to improve the density estimate by setting the density argument (from, to, kernel).
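
The problem with bounded data can be seen with base R alone. The following is a minimal sketch using simulated, p-value-like data (uniform on [0, 1]); the variable names are illustrative and not part of geneplotter.

```r
set.seed(42)
p <- runif(2000)   # p-value-like data, supported on [0, 1]

# By default, density() evaluates the estimate on a grid that extends
# beyond the data range, so the curve covers values outside [0, 1]
d_default <- density(p)

# `from` and `to` restrict the evaluation grid to [0, 1]; they do not
# correct the smoothing bias at the boundaries
d_bounded <- density(p, from = 0, to = 1)

range(d_default$x)
range(d_bounded$x)

# The ecdf involves no smoothing, so it cannot leak mass outside the support
Fp <- ecdf(p)
Fp(c(0, 0.5, 1))
```

With multidensity, the same `from` and `to` values would be supplied through the `density` argument, for example `density = list(from = 0, to = 1)`.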

Bandwidths: the choice of the smoothing bandwidths in multidensity can be problematic, in particular if the groups differ in range and/or number of data points. If curves look excessively wiggly or overly smooth, try varying the arguments xlim and bw; note that bw can be a vector, in which case it is expected to align with the groups.
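
How a vector-valued `bw` lines up with the groups can be sketched with base R's density. This illustrates the documented rule (length 1 or one value per group) under assumed data and bandwidth values; it is not geneplotter's own code.

```r
set.seed(7)
# Two groups that differ strongly in spread and in number of observations
groups <- list(narrow = rnorm(200, sd = 0.2),
               wide   = rnorm(2000, sd = 3))

# One bandwidth per group (illustrative values)
bw <- c(0.05, 0.5)

# bw must have length 1 or one entry per group
stopifnot(length(bw) %in% c(1L, length(groups)))
bw <- rep(bw, length.out = length(groups))

# Pair each group with its bandwidth and estimate the densities
dens <- Map(function(v, b) density(v, bw = b), groups, bw)

# Bandwidth actually used for each group
used <- vapply(dens, function(d) d$bw, numeric(1))
used
```

A single shared bandwidth that suits the wide group would oversmooth the narrow one, which is why per-group values can help.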

Value

For the multidensity functions, a list of density objects.

Author(s)

Wolfgang Huber

Examples

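  library(geneplotter)

  ## use the words of the package description as group labels
  words = strsplit(packageDescription("geneplotter")$Description, " ")[[1]]
  factr = factor(sample(words, 2000, replace = TRUE))
  ## normal data whose mean is the integer code of each group
  x = rnorm(length(factr), mean=as.integer(factr))
  
  multiecdf(x ~ factr)
  multidensity(x ~ factr)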

Results


> library(geneplotter)
Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: lattice
Loading required package: annotate
Loading required package: AnnotationDbi
Loading required package: stats4
Loading required package: IRanges
Loading required package: S4Vectors

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: XML
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/geneplotter/multiecdf.Rd_%03d_medium.png", width=480, height=480)
> ### Name: multiecdf
> ### Title: Multiple empirical cumulative distribution functions (ecdf) and
> ###   densities
> ### Aliases: multiecdf multiecdf.list multiecdf.formula multiecdf.matrix
> ###   multidensity multidensity.list multidensity.formula
> ###   multidensity.matrix
> ### Keywords: hplot
> 
> ### ** Examples
> 
>   words = strsplit(packageDescription("geneplotter")$Description, " ")[[1]]
>   factr = factor(sample(words, 2000, replace = TRUE))
>   x = rnorm(length(factr), mean=as.integer(factr))
>   
>   multiecdf(x ~ factr)
>   multidensity(x ~ factr)
> 
> dev.off()
null device 
          1 
>