I have a table that contains:
id task_ids 1 10,15 2 NULL 3 17
I have the table that has the names of this tasks:
id task_name 10 a 15 b 17 c
I want to generate the following output
id task_ids task_names 1 10,15 a,b 2 null null 3 17 c
I know this structure isn’t ideal but this is legacy table which I will not change now.
Is there easy way to get the output ? I’m using Presto but I think this can be solved with native sql
Advertisement
Answer
WITH data AS (
SELECT * FROM (VALUES (1, '10,15'), (2, NULL)) x(id, task_ids)
),
task AS (
SELECT * FROM (VALUES ('10', 'a'), ('15', 'b')) x(id, task_name)
)
SELECT
d.id, d.task_ids
-- array_agg will obviously capture NULL task_name comping from LEFT JOIN, so we need to filter out such results
IF(array_agg(t.task_name) IS NOT DISTINCT FROM ARRAY[NULL], NULL, array_agg(t.task_name)) task_names
FROM data d
-- split task_ids by `,`, convert into numbers, UNNEST into separate rows
LEFT JOIN UNNEST (split(d.task_ids, ',')) AS e(task_id) ON true
-- LEFT JOIN with task to pull the task name
LEFT JOIN task t ON e.task_id = t.id
-- aggregate back
GROUP BY d.id, d.task_ids;