Skip to content
Advertisement

How do I do conditional logic between rows of a bigquery table?

I’m trying to write a query that goes through a table row by row comparing the current row with the next. Then based on a condition being true will perform a calculation which is then output in a column on the same table and a null value if false.

Current table

Consider the example above:

  • Row 8703 will be referred to as Row 1
  • Row 8704 will be referred to as Row 2

I would like to, if possible, compare Row 1 bookedEnd with Row 2 bookedStart. If they are of equal value (which in this case they are) I would like to subtract Row 2 actualStartdate from Row 1 actualEnddate and output the value in minutes in a separate column named ‘difference’ on Row 2. If they are not of equal value (which is true for all other columns in the example above) I would like to output a null value.

For the above table the extra column named difference would have the row values of:

  • 8701 – Null
  • 8702 – Null
  • 8703 – Null
  • 8704 – 12
  • 8705 – Null

Advertisement

Answer

Since you are writing to “Row 2”, I use the LAG() function so you are comparing on the row you are writing.

with data as (select * from `project.dataset.table`),
lagged as (
  select 
    *,
    lag(bookedEnd,1) over(partition by roomID order by Row asc) as prev_bookedEnd,
    lag(actualEnddate,1) over(partition by roomID order by Row asc) as prev_actualEnddate
  from data
)
select 
  * except (prev_bookedEnd,prev_actualEnddate),
  case when prev_bookedEnd = bookedStart then timestamp_diff(prev_actualEndDate,actualStartdate, minute) else null end as difference
from lagged

User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement