i have a set of data sample like this
x
| id | month
| x | jan
| x | feb
| x | mar
is there any way i can populate that sample to be like this?
| id | month | number
| x | jan | 1
| x | feb | 2
| x | mar | 3
| x | feb | 1
| x | mar | 2
| x | mar | 1
this work like union all but excluding the first month each loops
Advertisement
Answer
Below is for BigQuery Standard SQL
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 id, 'jan' month, 1 pos UNION ALL
SELECT 1, 'feb', 2 UNION ALL
SELECT 1, 'mar', 3
)
SELECT id, month,
pos - MIN(pos) OVER(PARTITION BY id, num) + 1 AS number
FROM `project.dataset.table`,
UNNEST(GENERATE_ARRAY(1, pos)) AS num
-- ORDER BY num
with output
Row id month number
1 1 jan 1
2 1 feb 2
3 1 mar 3
4 1 feb 1
5 1 mar 2
6 1 mar 1
In case if pos
field is not explicitly available – you can derive it as in below example
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 id, 'jan' month UNION ALL
SELECT 1, 'feb' UNION ALL
SELECT 1, 'mar'
), temp AS (
SELECT id, month, EXTRACT(MONTH FROM PARSE_DATE('%b', month)) pos
FROM `project.dataset.table`
)
SELECT id, month,
pos - MIN(pos) OVER(PARTITION BY id, num) + 1 AS number
FROM temp,
UNNEST(GENERATE_ARRAY(1, pos)) AS num
-- ORDER BY num
with the same final output