Group by one or more variables — group_by.tbl

This is a method for the dplyr group_by() generic. It is translated to the GROUP BY clause of the SQL query when used with summarise() and to the PARTITION BY clause of window functions when used with mutate().

Usage

# S3 method for tbl_lazy
group_by(.data, ..., .add = FALSE, add = deprecated(), .drop = TRUE)

Arguments

.data

A lazy data frame backed by a database query.

...

<data-masking> Variables, or functions of variables. Use desc() to sort a variable in descending order.

.add

When FALSE, the default, group_by() will override existing groups. To add to the existing groups, use .add = TRUE.

This argument was previously called add, but that prevented creating a new grouping variable called add, and conflicts with our naming conventions.

add

Deprecated. Please use .add instead.

.drop

Not supported by this method.

Examples

library(dplyr, warn.conflicts = FALSE)

db <- memdb_frame(g = c(1, 1, 1, 2, 2), x = c(4, 3, 6, 9, 2))
db %>%
  group_by(g) %>%
  summarise(n()) %>%
  show_query()
#> <SQL>
#> SELECT `g`, COUNT(*) AS `n()`
#> FROM `dbplyr_1qeoeBlLyC`
#> GROUP BY `g`

db %>%
  group_by(g) %>%
  mutate(x2 = x / sum(x, na.rm = TRUE)) %>%
  show_query()
#> <SQL>
#> SELECT `dbplyr_1qeoeBlLyC`.*, `x` / SUM(`x`) OVER (PARTITION BY `g`) AS `x2`
#> FROM `dbplyr_1qeoeBlLyC`