Sort by one variable, group by another, and select first row in SQL Query in R

Question

I need to apply a procedure in SQL that is easy for me since R, but has been really tortuous in SQL. I need to sort the data from highest to lowest by two variables, group based on another variable, and select the first item in each group. I leave the code that I am trying to pass from R

Accepted Answer

Maybe the following?library(tidyverse)library(dbplyr)con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")copy_to(con, mtcars)mtcars2 <- tbl(con, "mtcars")mtcars2 %>% arrange(-mpg,-disp) %>% group_by(cyl) %>% mutate(hp = first(hp)) %>% select(cyl, hp) %>% distinct %>% show_query#> #> SELECT DISTINCT `cyl`, FIRST_VALUE(`hp`) OVER (PARTITION BY `cyl` ORDER BY -`mpg`, -`disp`) AS `hp`#> FROM `mtcars`#> ORDER BY -`mpg`, -`disp`See: https://github.com/tidyverse/dbplyr/issues/129

Advertisement

Answer