Last data update: 2014.03.03

cut2 {Hmisc}    R Documentation

Cut a Numeric Variable into Intervals

Description

Function like cut, but left endpoints are inclusive and labels have the form [lower, upper), except that the last interval is [lower, upper]. If cuts are given, cut2 by default makes sure that the cuts cover the entire range of x. If cuts are not given, cut2 cuts x into quantile groups (g given) or into groups with a given minimum number of observations (m). In S-Plus, cut creates a category object whereas cut2 creates a factor object; in R, both return a factor.
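The interval convention described above can be illustrated with base R alone. The following is a minimal sketch, not cut2's own implementation: it uses cut with right=FALSE and include.lowest=TRUE to get left-closed intervals with a closed last interval, and compares the result with cut's default right-closed behavior. The data and break points are made up for illustration, and cut2 formats its labels differently.

```r
# Illustrative data and break points (hypothetical, not from the help page)
x <- c(0, 5, 10, 15, 20, 30)
breaks <- c(0, 10, 20, 30)

# Default cut(): right-closed intervals (a, b]; the minimum falls outside
default_cut <- cut(x, breaks)
levels(default_cut)      # "(0,10]"  "(10,20]" "(20,30]"
is.na(default_cut[1])    # TRUE: x = 0 is not in (0, 10]

# Left-closed intervals [a, b), with the last interval closed on both ends,
# which is the convention the Description attributes to cut2
left_closed <- cut(x, breaks, right = FALSE, include.lowest = TRUE)
levels(left_closed)      # "[0,10)"  "[10,20)" "[20,30]"

# A value equal to a break point goes to the interval that starts there
as.character(left_closed[x == 10])   # "[10,20)"
```

With the left-closed convention every value of x, including the minimum and maximum, is assigned to an interval.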

Usage

cut2(x, cuts, m, g, levels.mean, digits, minmax=TRUE, oneval=TRUE, onlycuts=FALSE)

Arguments

x

numeric vector to classify into intervals

cuts

cut points

m

desired minimum number of observations in a group. The algorithm does not guarantee that all groups will have at least m observations.

g

number of quantile groups

levels.mean

set to TRUE to give the new categorical vector a levels attribute consisting of the group means of x instead of interval endpoint labels

digits

number of significant digits to use in constructing levels. Default is 3 (5 if levels.mean=TRUE)

minmax

if cuts is specified but min(x)<min(cuts) or max(x)>max(cuts), augments cuts to include the minimum and maximum of x

oneval

if an interval contains only one unique value, the interval will be labeled with the formatted version of that value instead of the interval endpoints, unless oneval=FALSE

onlycuts

set to TRUE to return only the vector of computed cuts. This consists of the interior values plus outer ranges.
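The arguments above can be tried side by side. The sketch below assumes the Hmisc package is installed (the calls are skipped otherwise); the simulated data, the sample size, and the variable names are illustrative choices, not part of the original documentation.

```r
# Guarded so the script still runs where Hmisc is not installed
if (requireNamespace("Hmisc", quietly = TRUE)) {
  set.seed(1)
  x <- runif(200, 0, 100)   # made-up data for illustration

  # g: split x into four quantile groups
  by_quartile <- Hmisc::cut2(x, g = 4)
  print(table(by_quartile))

  # levels.mean: label each group by its mean instead of its endpoints
  by_mean <- Hmisc::cut2(x, g = 4, levels.mean = TRUE)
  print(levels(by_mean))

  # onlycuts: return the computed cut points rather than a factor
  cut_points <- Hmisc::cut2(x, g = 4, onlycuts = TRUE)
  print(cut_points)

  # minmax (default TRUE): user-supplied cuts are extended so that the
  # first and last intervals reach min(x) and max(x)
  covered <- Hmisc::cut2(x, cuts = c(25, 50, 75))
  print(levels(covered))
}
```

Because minmax defaults to TRUE, the three supplied cut points in the last call yield four intervals and no observation is left unclassified.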

Value

a factor variable with levels of the form [a,b) or formatted means (character strings), unless onlycuts is TRUE, in which case a numeric vector is returned

See Also

cut, quantile

Examples

set.seed(1)
x <- runif(1000, 0, 100)
z <- cut2(x, c(10,20,30))
table(z)
table(cut2(x, g=10))      # quantile groups
table(cut2(x, m=50))      # group x into intervals with at least 50 obs.
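For readers who want to see what quantile grouping amounts to, here is a rough base R analogue of the g=10 example. It is only a sketch under stated assumptions: it computes break points with quantile (default type) and passes them to cut, which is not necessarily how cut2 computes its cuts, so the boundaries and labels may differ from cut2's.

```r
set.seed(1)
x <- runif(1000, 0, 100)

g <- 10   # number of quantile groups, as in cut2(x, g=10)

# Break points at the 0%, 10%, ..., 100% sample quantiles
breaks <- quantile(x, probs = seq(0, 1, length.out = g + 1))

# Left-closed intervals, last interval closed, so max(x) is included
groups <- cut(x, breaks, right = FALSE, include.lowest = TRUE)

# Group sizes should be roughly equal
table(groups)
```

Computing the breaks explicitly makes it easy to reuse the same boundaries on another data set, which is also what onlycuts=TRUE is for in cut2.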

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(Hmisc)
Loading required package: lattice
Loading required package: survival
Loading required package: Formula
Loading required package: ggplot2

Attaching package: 'Hmisc'

The following objects are masked from 'package:base':

    format.pval, round.POSIXt, trunc.POSIXt, units

> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/Hmisc/cut2.Rd_%03d_medium.png", width=480, height=480)
> ### Name: cut2
> ### Title: Cut a Numeric Variable into Intervals
> ### Aliases: cut2
> ### Keywords: category nonparametric
> 
> ### ** Examples
> 
> set.seed(1)
> x <- runif(1000, 0, 100)
> z <- cut2(x, c(10,20,30))
> table(z)
z
[  0.131, 10.000) [ 10.000, 20.000) [ 20.000, 30.000) [ 30.000, 99.993] 
               96               104                93               707 
> table(cut2(x, g=10))      # quantile groups

[ 0.131, 10.5) [10.505, 20.2) [20.168, 31.2) [31.204, 39.8) [39.784, 48.4) 
           100            100            100            100            100 
[48.435, 59.6) [59.645, 70.7) [70.666, 79.7) [79.731, 91.0) [91.037,100.0] 
           100            100            100            100            100 
> table(cut2(x, m=50))      # group x into intervals with at least 50 obs.

[ 0.131,  5.52) [ 5.516, 10.51) [10.505, 15.48) [15.483, 20.17) [20.168, 25.82) 
             50              50              50              50              50 
[25.817, 31.20) [31.204, 35.32) [35.320, 39.78) [39.784, 44.15) [44.146, 48.43) 
             50              50              50              50              50 
[48.435, 52.78) [52.778, 59.64) [59.645, 65.09) [65.087, 70.67) [70.666, 74.76) 
             50              50              50              50              50 
[74.764, 79.73) [79.731, 85.51) [85.508, 91.04) [91.037, 95.37) [95.373, 99.99] 
             50              50              50              50              50 
> 
> dev.off()
null device 
          1 
>