Creating a cumulative sum column using with_order in R

I’m working alongside a SQL tutorial using queryparser and tidyquery in R. This has been going well until I was asked to do:

Tidyquery reported that it did not support OVER functions so I am trying to replicate the OVER (PARTITION BY...) function with dplyr.

This led me to with_order(order_by =...) in dplyr. Now I’m struggling to get the fun = argument to produce a cumulative sum column.

This gives me the error

Am I going down the wrong rabbit hole when it comes to recreating OVER(PARTITION BY...)? If so, what is a better option? Or am I missing how to properly use with_order(order_by =...)?

If it is not clear, my goal is to create a new column that keeps a running total of vaccinations for each separate location.

Answer

The PARTITION BY aspect of SQL can often be done in dplyr using group_by.

And the ORDER BY aspect of SQL can often be done in dplyr using arrange.
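As a minimal sketch of how the two verbs combine into a running total: the data frame and its column names (location, date, new_vaccinations) are assumptions made up for illustration, not taken from the tutorial. Under those assumptions, the pipeline below plays roughly the role of SUM(new_vaccinations) OVER (PARTITION BY location ORDER BY date) in SQL. The second pipeline shows how with_order() could be used inside mutate() instead, which leaves the rows in their original order.

```r
library(dplyr)

# Hypothetical data: the table and column names are assumptions for illustration
vaccinations <- tibble(
  location = c("A", "A", "A", "B", "B"),
  date = as.Date(c("2021-01-03", "2021-01-01", "2021-01-02",
                   "2021-01-02", "2021-01-01")),
  new_vaccinations = c(30, 10, 20, 7, 5)
)

# Option 1: sort first, then take a cumulative sum within each group
running_totals <- vaccinations %>%
  arrange(location, date) %>%                           # ORDER BY date (within location)
  group_by(location) %>%                                # PARTITION BY location
  mutate(running_total = cumsum(new_vaccinations)) %>%  # SUM(...) OVER (...)
  ungroup()

# Option 2: keep the original row order and let with_order() handle the ordering
# with_order(order_by, fun, x) applies fun to x sorted by order_by,
# then returns the result in the original row order
running_totals_unsorted <- vaccinations %>%
  group_by(location) %>%
  mutate(running_total = with_order(date, cumsum, new_vaccinations)) %>%
  ungroup()

print(running_totals)
print(running_totals_unsorted)
```

Both versions give each row the same running total; they differ only in whether the result comes back sorted. Calling ungroup() at the end avoids later operations being silently applied per location.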

Consider this R code:

Is equivalent to this SQL:

User contributions licensed under: CC BY-SA