Skip to content
Advertisement

“Flattening” complex sequences of related rows in SQL

I have a table with events describing user navigation through flows, with this general shape (disregard the timestamps, I smashed some keys):

After the option_a event, users can go back and choose another option:

Users can also go through a flow multiple times:

I want to produce a table where each row represents a flow, from start to end, including the selected option along the way. For example:

I need to avoid:

  1. False results from the back event changing the option mid-flow.
  2. False results from mixing up two different instances of the flow.
  3. False results from mixing up two different users.

I have no idea how to approach this. If it’s any difference, I’m using BigQuery, which has some differences with standard SQL and may have some specific functions.

Advertisement

Answer

You can add up the starts to assign a grouping. Then you want the last option in each group:

This makes some assumptions about the data, notably:

  • 'start' and 'stop' are lined up.
  • you want the most recent non-null option_choice.
User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement