These are methods for the dplyr generics slice_min(), slice_max(), and slice_sample(). They are translated to SQL using filter() and window functions (ROWNUMBER, MIN_RANK, or CUME_DIST depending on arguments). slice(), slice_head(), and slice_tail() are not supported since database tables have no intrinsic order.

If data is grouped, the operation will be performed on each group so that (e.g.) slice_min(db, x, n = 3) will select the three rows with the smallest value of x in each group.

# S3 method for tbl_lazy
slice_min(.data, order_by, ..., n, prop, with_ties = TRUE)

# S3 method for tbl_lazy
slice_max(.data, order_by, ..., n, prop, with_ties = TRUE)

# S3 method for tbl_lazy
slice_sample(.data, ..., n, prop, weight_by = NULL, replace = FALSE)

Arguments

.data

A lazy data frame backed by a database query.

order_by

Variable or function of variables to order by.

...

Not used.

n, prop

Provide either n, the number of rows, or prop, the proportion of rows to select. If neither are supplied, n = 1 will be used.

If n is greater than the number of rows in the group (or prop > 1), the result will be silently truncated to the group size. If the proportion of a group size is not an integer, it is rounded down.

with_ties

Should ties be kept together? The default, TRUE, may return more rows than you request. Use FALSE to ignore ties, and return the first n rows.

weight_by, replace

Not supported for database backends.

Examples

library(dplyr, warn.conflicts = FALSE) db <- memdb_frame(x = 1:3, y = c(1, 1, 2)) db %>% slice_min(x) %>% show_query()
#> <SQL> #> SELECT `x`, `y` #> FROM (SELECT `x`, `y`, RANK() OVER (ORDER BY `x`) AS `q01` #> FROM `dbplyr_004`) #> WHERE (`q01` <= 1)
db %>% slice_max(x) %>% show_query()
#> <SQL> #> SELECT `x`, `y` #> FROM (SELECT `x`, `y`, RANK() OVER (ORDER BY `x` DESC) AS `q01` #> FROM `dbplyr_004`) #> WHERE (`q01` <= 1)
db %>% slice_sample() %>% show_query()
#> <SQL> #> SELECT `x`, `y` #> FROM (SELECT `x`, `y`, ROW_NUMBER() OVER (ORDER BY random()) AS `q01` #> FROM `dbplyr_004`) #> WHERE (`q01` <= 1)
db %>% group_by(y) %>% slice_min(x) %>% show_query()
#> <SQL> #> SELECT `x`, `y` #> FROM (SELECT `x`, `y`, RANK() OVER (PARTITION BY `y` ORDER BY `x`) AS `q01` #> FROM `dbplyr_004`) #> WHERE (`q01` <= 1)
# By default, ties are includes so you may get more rows # than you expect db %>% slice_min(y, n = 1)
#> # Source: lazy query [?? x 2] #> # Database: sqlite 3.33.0 [:memory:] #> # Ordered by: y #> x y #> <int> <dbl> #> 1 1 1 #> 2 2 1
db %>% slice_min(y, n = 1, with_ties = FALSE)
#> # Source: lazy query [?? x 2] #> # Database: sqlite 3.33.0 [:memory:] #> # Ordered by: y #> x y #> <int> <dbl> #> 1 1 1
# Non-integer group sizes are rounded down db %>% slice_min(x, prop = 0.5)
#> # Source: lazy query [?? x 2] #> # Database: sqlite 3.33.0 [:memory:] #> # Ordered by: x #> x y #> <int> <dbl> #> 1 1 1