Skip to content
Advertisement

Is there any way i can populate entity based on months but excluding the first month each loops?

i have a set of data sample like this

| id | month
| x  | jan
| x  | feb
| x  | mar

is there any way i can populate that sample to be like this?

| id | month | number
| x  | jan   | 1
| x  | feb   | 2
| x  | mar   | 3
| x  | feb   | 1
| x  | mar   | 2
| x  | mar   | 1

this work like union all but excluding the first month each loops

Advertisement

Answer

Below is for BigQuery Standard SQL

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 1 id, 'jan' month, 1 pos UNION ALL
  SELECT 1, 'feb', 2 UNION ALL
  SELECT 1, 'mar', 3 
)
SELECT id, month, 
  pos - MIN(pos) OVER(PARTITION BY id, num) + 1 AS number
FROM `project.dataset.table`,
UNNEST(GENERATE_ARRAY(1, pos)) AS num
-- ORDER BY num  

with output

Row id  month   number   
1   1   jan     1    
2   1   feb     2    
3   1   mar     3    
4   1   feb     1    
5   1   mar     2    
6   1   mar     1      

In case if pos field is not explicitly available – you can derive it as in below example

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 1 id, 'jan' month UNION ALL
  SELECT 1, 'feb' UNION ALL
  SELECT 1, 'mar'
), temp AS (
  SELECT id, month, EXTRACT(MONTH FROM PARSE_DATE('%b', month)) pos
  FROM `project.dataset.table`
)
SELECT id, month, 
  pos - MIN(pos) OVER(PARTITION BY id, num) + 1 AS number
FROM temp,
UNNEST(GENERATE_ARRAY(1, pos)) AS num
-- ORDER BY num    

with the same final output

User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement