Keep or drop rows that match a condition — filter.tbl

These are methods for the dplyr::filter() and dplyr::filter_out() generics. They generate the WHERE clause of the SQL query.

filter() is translated directly to WHERE, which already matches dplyr's behaviour of treating NA like FALSE (SQL's three-valued logic drops NULL rows from WHERE).

filter_out() needs an extra step: the combined condition is wrapped in is_distinct_from(., TRUE), which each backend then translates (e.g. to IS DISTINCT FROM on PostgreSQL, IS NOT on SQLite). A plain NOT would also drop rows where the condition is NULL, whereas dplyr's filter_out() drops only the rows where the condition is TRUE and keeps those where it is FALSE or NA.
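
The NA handling described above can be sketched in base R, without a database. This is only an illustration of the three-valued logic, not how dbplyr implements the translation; the variable names (cond, keep_filter, keep_filter_out, keep_naive_not) are made up for this sketch.

```r
x <- c(2, NA, 5, NA, 10)
cond <- x < 5                           # TRUE NA FALSE NA FALSE

# filter(): keep rows where the condition is TRUE (NA behaves like FALSE)
keep_filter <- !is.na(cond) & cond

# filter_out(): drop rows where the condition is TRUE, keep FALSE and NA
keep_filter_out <- is.na(cond) | !cond

# What a plain NOT in a WHERE clause would keep: only rows that are FALSE
keep_naive_not <- !is.na(cond) & !cond

x[keep_filter]      # 2
x[keep_filter_out]  # NA 5 NA 10
x[keep_naive_not]   # 5 10
```

The last two results differ only in the NA rows, which is the gap that wrapping the condition in is_distinct_from(., TRUE) is meant to close.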

Usage

# S3 method for class 'tbl_lazy'
filter(.data, ..., .by = NULL, .preserve = FALSE)

# S3 method for class 'tbl_lazy'
filter_out(.data, ..., .by = NULL, .preserve = FALSE)

Arguments

.data: A lazy data frame backed by a database query.
...: <data-masking> Expressions that return a logical value, defined in terms of the variables in .data. If multiple expressions are included, they are combined with the & operator.
.by: <tidy-select> Optionally, a selection of columns to group by for just this operation, functioning as an alternative to group_by(). For details and examples, see ?dplyr_by.
.preserve: Not supported by this method.

Value

Another tbl_lazy. Use show_query() to see the generated query, and use collect() to execute the query and return data to R.

Examples

library(dplyr, warn.conflicts = FALSE)

db <- memdb_frame(x = c(2, NA, 5, NA, 10), y = 1:5)
db |> filter(x < 5) |> show_query()
#> <SQL>
#> SELECT *
#> FROM `dbplyr_tmp_Pg2ZRmBV2z`
#> WHERE (`x` < 5.0)
db |> filter_out(x < 5) |> show_query()
#> <SQL>
#> SELECT *
#> FROM `dbplyr_tmp_Pg2ZRmBV2z`
#> WHERE ((`x` < 5.0) IS NOT (1))
db |> filter(is.na(x)) |> show_query()
#> <SQL>
#> SELECT *
#> FROM `dbplyr_tmp_Pg2ZRmBV2z`
#> WHERE ((`x` IS NULL))
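
To see which rows the WHERE clauses above actually keep, the same conditions can be run directly against SQLite. This is a minimal sketch that assumes the DBI and RSQLite packages are installed; the table name t and the result names are hypothetical, and the SQL is written by hand to mirror the generated queries rather than produced by dbplyr.

```r
library(DBI)

con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "t", data.frame(x = c(2, NA, 5, NA, 10), y = 1:5))

# Same shape as the filter() query: NULL comparisons are dropped
kept_filter <- dbGetQuery(con, "SELECT * FROM t WHERE (x < 5.0)")

# Same shape as the filter_out() query on SQLite: NULL comparisons are kept
kept_filter_out <- dbGetQuery(con, "SELECT * FROM t WHERE ((x < 5.0) IS NOT (1))")

# A plain NOT, for comparison: NULL comparisons are dropped again
kept_not <- dbGetQuery(con, "SELECT * FROM t WHERE NOT (x < 5.0)")

dbDisconnect(con)

kept_filter$y      # 1
kept_filter_out$y  # 2 3 4 5
kept_not$y         # 3 5
```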