Last data update: 2014.03.03

group_by

R Documentation

Group a tbl by one or more variables.

Description

Most data operations are most useful when done on groups defined by variables in the dataset. The group_by function takes an existing tbl and converts it into a grouped tbl, on which operations are performed "by group".
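As a rough illustration of what "by group" means, the sketch below computes a per-group mean with a grouped tbl and, for comparison, with base R's tapply. It assumes dplyr is installed and uses the built-in mtcars data; the names dplyr_means and base_means are illustrative only.

```r
library(dplyr)

# Mean mpg for each value of cyl, computed "by group" with dplyr.
by_cyl <- group_by(mtcars, cyl)
dplyr_means <- summarise(by_cyl, mean_mpg = mean(mpg))

# The same per-group means with base R, for comparison.
base_means <- tapply(mtcars$mpg, mtcars$cyl, mean)

print(dplyr_means)
print(base_means)
```

Both approaches should give one mean per distinct value of cyl; the grouped tbl simply lets later verbs such as summarise pick up the grouping without it being restated.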

Usage

group_by(.data, ..., add = FALSE)

group_by_(.data, ..., .dots, add = FALSE)

Arguments

`.data`	a tbl
`...`	variables to group by. All tbls accept variable names; some also accept functions of variables. Duplicated groups will be silently dropped.
`add`	By default, when `add = FALSE`, `group_by` will override existing groups. To add to the existing groups instead, use `add = TRUE`.
`.dots`	Used to work around non-standard evaluation. See `vignette("nse")` for details.
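The `.dots` argument is not covered in the examples, so here is a minimal sketch of grouping by column names held as strings. It assumes a dplyr version that still provides group_by_ (the underscored verbs were deprecated in dplyr 0.7.0, so newer releases may warn); group_cols and group_names are hypothetical names used for illustration.

```r
library(dplyr)

# Hypothetical input: grouping columns chosen at run time, held as strings.
group_cols <- c("cyl", "am")

# group_by_() takes the variables through .dots instead of as bare names.
by_cols <- group_by_(mtcars, .dots = group_cols)

# Recover the grouping variables as a character vector.
group_names <- vapply(groups(by_cols), as.character, character(1))
print(group_names)
```

This form is handy inside functions, where the grouping variables arrive as arguments rather than being typed literally.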

Tbl types

group_by is an S3 generic with methods for the built-in tbls listed below. See the help for the corresponding classes and their manip methods for more details:

data.frame: grouped_df
data.table: grouped_dt
SQLite: src_sqlite
PostgreSQL: src_postgres
MySQL: src_mysql
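For the data.frame case, the dispatch can be checked directly. The sketch below assumes dplyr is installed and only inspects the class and shape of the result; is_grouped and same_shape are illustrative names.

```r
library(dplyr)

by_cyl <- group_by(mtcars, cyl)

# For a data.frame input, the result should carry the grouped_df class.
is_grouped <- inherits(by_cyl, "grouped_df")
print(class(by_cyl))

# Grouping adds metadata only; the rows and columns are unchanged.
same_shape <- identical(dim(by_cyl), dim(mtcars))
print(same_shape)
```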

Examples

library(dplyr)

by_cyl <- group_by(mtcars, cyl)
summarise(by_cyl, mean(disp), mean(hp))
filter(by_cyl, disp == max(disp))

# summarise peels off a single layer of grouping
by_vs_am <- group_by(mtcars, vs, am)
by_vs <- summarise(by_vs_am, n = n())
by_vs
summarise(by_vs, n = sum(n))
# use ungroup() to remove the remaining grouping if it is not wanted
summarise(ungroup(by_vs), n = sum(n))

# You can group by expressions: this is just short-hand for
# a mutate/rename followed by a simple group_by
group_by(mtcars, vsam = vs + am)
group_by(mtcars, vs2 = vs)

# You can also group by a constant, but it's not very useful
group_by(mtcars, "vs")

# By default, group_by overrides existing groups. Use add = TRUE to add to them
groups(group_by(by_cyl, vs, am))
groups(group_by(by_cyl, vs, am, add = TRUE))

# Duplicate groups are silently dropped
groups(group_by(by_cyl, cyl, cyl))