Skip to content
Advertisement

calculation with undefined number of rows

I would like to add a calculation taking into account we could have one, two, three or a lot of more rows. We can get all the data using the relations but I am a bit stuck because the number of relation are undefined. For example, as a source:

SELECT 123 AS id
      ,250 AS amount
      ,225 AS debt
      ,NULL AS relation
      ,1 AS rn
UNION ALL    
SELECT 124 AS id
      ,150 AS amount
      ,25 AS debt
      ,123 AS relation
      ,2 AS rn
UNION ALL   
SELECT 125 AS id
      ,160 AS amount
      ,50.25 AS debt
      ,124 AS relation
      ,3 AS rn
UNION ALL
SELECT 126 AS id
      ,80 AS amount
      ,25 AS debt
      ,125 AS relation
      ,4 AS rn 

Source Table

id amount debt relation rn
123 250 225 NULL 1
124 150 25 123 2
125 160 50.25 124 3
126 80 25 125 4

End Table

id amount debt relation rn cal
123 250 225 NULL 1 250
124 150 25 123 2 22.5
125 160 50.25 124 3 7.5375
126 80 25 125 4 3.75

I would need to apply a calculation like:

  • row 1: amount
  • row 2: row2.debt*row1.debt/row1.amount
  • row 3: row3.debt*row2.debt/row2.amount*row1.debt/row1.amount
  • row 4: row4.debt*row3.debt/row3.debt*row2.debt/row2.amount*row1.debt/row1.amount
  • etc..

I am using dbt but happy to hear about BigQuery or other SQL as I am really curious how this could be done.

Advertisement

Answer

A little math can help here – see below

select *,
  ifnull(
    round(exp(sum(ln(debt)) over(order by id rows between unbounded preceding and current row))
    / exp(sum(ln(amount)) over(order by id rows between unbounded preceding and 1 preceding)), 2),
    amount
  ) cal
from `project.dataset.table`  

if applied to sample data in your question – output is

enter image description here

Note: I am using order by id assuming that id column defines order of rows. You can adjust it as needed

User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement