I have a postgresql table with two columns. name VARCHAR(255) and notes JSON. A sample dataset might look like this:
| name | notes |
|-----------|----------------------------------|
| 'anna' | {'link_to': ['bob']} |
| 'bob' | {'link_to': ['anna', 'claudia']} |
| 'claudia' | {'link_to': []} |
Now I want to do two things.
put the list from the json at ‘link_to’ into another column called referrals_to (which must be of array VARCHAR type then). From my example:
name notes referrals_to ‘anna’ {‘link_to’: [‘bob’]} [‘bob’] ‘bob’ {‘link_to’: [‘anna’, ‘claudia’]} [‘anna’, ‘claudia’] ‘claudia’ {‘link_to’: []} [] create another column called referrals_from, where I want to store all names from which a name was referred. In my example:
name notes referrals_from ‘anna’ {‘link_to’: [‘bob’]} [‘bob’] ‘bob’ {‘link_to’: [‘anna’, ‘claudia’]} [‘anna’] ‘claudia’ {‘link_to’: []} [‘bob’]
How do I do this using postgresql queries? I could easily do it using python, but this would be slower than using postgresql directly, I guess.
Advertisement
Answer
The referrals_to
column can be done using the ->
operator to extract the array. For the referrals_from
column I would use a scalar sub-select to collect all referring names into a single column (a JSON array)
select name,
notes,
notes -> 'link_to' as referrals_to,
(select jsonb_agg(name)
from the_table t2
where t2.name <> t1.name
and t2.notes -> 'link_to' ? t1.name) as referrals_from
from the_table t1
;
The ?
operator tests if a JSON array contains a specific string. In this case it tests if the link_to
array contains the name from the outer query.
Assuming name
is unique in your table you can use that query to update the new columns in the table:
update the_table
set referrals_to = notes -> 'link_to',
referrals_from = t.referrals_from
from (
select t1.name,
(select jsonb_agg(name)
from the_table t2
where t2.name <> t1.name
and t2.notes -> 'link_to' ? t1.name) as referrals_from
from the_table t1
) t
where t.name = the_table.name;