i have a set of data sample like this
| id | month | x | jan | x | feb | x | mar
is there any way i can populate that sample to be like this?
| id | month | number | x | jan | 1 | x | feb | 2 | x | mar | 3 | x | feb | 1 | x | mar | 2 | x | mar | 1
this work like union all but excluding the first month each loops
Advertisement
Answer
Below is for BigQuery Standard SQL
#standardSQL WITH `project.dataset.table` AS ( SELECT 1 id, 'jan' month, 1 pos UNION ALL SELECT 1, 'feb', 2 UNION ALL SELECT 1, 'mar', 3 ) SELECT id, month, pos - MIN(pos) OVER(PARTITION BY id, num) + 1 AS number FROM `project.dataset.table`, UNNEST(GENERATE_ARRAY(1, pos)) AS num -- ORDER BY num
with output
Row id month number 1 1 jan 1 2 1 feb 2 3 1 mar 3 4 1 feb 1 5 1 mar 2 6 1 mar 1
In case if pos
field is not explicitly available – you can derive it as in below example
#standardSQL WITH `project.dataset.table` AS ( SELECT 1 id, 'jan' month UNION ALL SELECT 1, 'feb' UNION ALL SELECT 1, 'mar' ), temp AS ( SELECT id, month, EXTRACT(MONTH FROM PARSE_DATE('%b', month)) pos FROM `project.dataset.table` ) SELECT id, month, pos - MIN(pos) OVER(PARTITION BY id, num) + 1 AS number FROM temp, UNNEST(GENERATE_ARRAY(1, pos)) AS num -- ORDER BY num
with the same final output