
See vignette("translation-function") and vignette("translation-verb") for details of overall translation technology. Key differences for this backend are better translation of statistical aggregate functions (e.g. var(), median()) and use of temporary views instead of temporary tables when copying data.

Use simulate_spark_sql() with lazy_frame() to see simulated SQL without connecting to a live database.

Usage

simulate_spark_sql()

Examples

library(dplyr, warn.conflicts = FALSE)
library(dbplyr) # provides lazy_frame() and simulate_spark_sql()

lf <- lazy_frame(a = TRUE, b = 1, d = 2, c = "z", con = simulate_spark_sql())

lf %>% summarise(x = median(d, na.rm = TRUE))
#> <SQL>
#> SELECT MEDIAN(`d`) AS `x`
#> FROM `df`
lf %>% summarise(x = var(c, na.rm = TRUE), .by = d)
#> <SQL>
#> SELECT `d`, VARIANCE(`c`) AS `x`
#> FROM `df`
#> GROUP BY `d`

lf %>% mutate(x = first(c))
#> <SQL>
#> SELECT `df`.*, FIRST_VALUE(`c`) OVER () AS `x`
#> FROM `df`
lf %>% mutate(x = first(c), .by = d)
#> <SQL>
#> SELECT `df`.*, FIRST_VALUE(`c`) OVER (PARTITION BY `d`) AS `x`
#> FROM `df`
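
The examples above print the generated SQL. If you want the query as a string instead, for instance to check it in a test, one possible approach is to pass the lazy query to sql_render(). This is a minimal sketch, not part of the original examples; the variable names `query`, `sql_text`, and `has_median` are illustrative only.

```r
library(dplyr, warn.conflicts = FALSE)
library(dbplyr)

# Same kind of simulated table as in the examples above (no live connection needed)
lf <- lazy_frame(d = 2, c = "z", con = simulate_spark_sql())

# sql_render() returns the generated query as an SQL object instead of printing it
query <- lf %>%
  summarise(x = median(d, na.rm = TRUE)) %>%
  sql_render()

# Convert to a plain character string so it can be inspected programmatically
sql_text <- as.character(query)

# Check that the aggregate was translated to MEDIAN(), as in the printed output above
has_median <- grepl("MEDIAN", sql_text, fixed = TRUE)
```

Checking for a fragment such as `MEDIAN` rather than the whole string keeps the check tolerant of whitespace and quoting details, which may differ between dbplyr versions.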