Skip to content
Advertisement

Count actions that occur only after a certain time from the previous action

I have a dataset in BigQuery where I want to count all non-duplicate actions. A duplicate action is one that occurs within a certain timeframe or ‘countdown’ from prior actions. Once the countdown reaches 0, the timer resets and the very next action is no longer considered to be duplicate.

In my example, the countdown timer starts at 10 seconds.

  • Sample code: http://sqlfiddle.com/#!17/2dd69/1
    • HitTime: Timestamp (seconds) for each row
    • DesiredOutput: What I am trying to create. Does not exist in my dataset
    • DesiredOutputExplation: An explanation to help explain the logic. Does not exist in my dataset

enter image description here

Advertisement

Answer

So I took the advice from shawnt00 and produced the following output in BigQuery.

While this does produce my desired output, I am unsure if there is a more effective way to do this (I’m pretty new to T-SQL). Especially considering my production dataset has ~800M rows in it.

User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement