See vignette("translation-function") and vignette("translation-verb") for
details of the overall translation technology. The key differences for this
backend are better translation of statistical aggregate functions
(e.g. var(), median()) and the use of temporary views instead of temporary
tables when copying data.
Use simulate_spark_sql() with lazy_frame() to see the simulated SQL without
connecting to a live database.
Examples
library(dplyr, warn.conflicts = FALSE)
library(dbplyr) # provides lazy_frame() and simulate_spark_sql()
lf <- lazy_frame(a = TRUE, b = 1, d = 2, c = "z", con = simulate_spark_sql())
lf %>% summarise(x = median(d, na.rm = TRUE))
#> <SQL>
#> SELECT MEDIAN(`d`) AS `x`
#> FROM `df`
lf %>% summarise(x = var(c, na.rm = TRUE), .by = d)
#> <SQL>
#> SELECT `d`, VARIANCE(`c`) AS `x`
#> FROM `df`
#> GROUP BY `d`
lf %>% mutate(x = first(c))
#> <SQL>
#> SELECT `df`.*, FIRST_VALUE(`c`) OVER () AS `x`
#> FROM `df`
lf %>% mutate(x = first(c), .by = d)
#> <SQL>
#> SELECT `df`.*, FIRST_VALUE(`c`) OVER (PARTITION BY `d`) AS `x`
#> FROM `df`
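
The examples above print the generated SQL. When the SQL is needed as a value, for instance to log or inspect it, it can be captured with sql_render(). The following is a minimal sketch against the simulated connection; the column names g and x are illustrative assumptions and do not come from any real table.

```r
library(dplyr, warn.conflicts = FALSE)
library(dbplyr)

# Hypothetical lazy table backed by the simulated Spark SQL connection;
# no live database is involved.
lf2 <- lazy_frame(g = "a", x = 1, con = simulate_spark_sql())

# Build a grouped aggregate and capture the generated SQL instead of printing it.
query <- lf2 %>%
  summarise(med = median(x, na.rm = TRUE), .by = g) %>%
  sql_render()

# sql_render() returns an SQL object; convert it to a plain string for inspection.
query_text <- as.character(query)
cat(query_text, "\n")
```

Capturing the query as a string makes it possible to check programmatically that an aggregate such as median() is translated to a native SQL function rather than left untranslated.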
