I have a postgresql table with two columns. name VARCHAR(255) and notes JSON. A sample dataset might look like this:
| name | notes | |-----------|----------------------------------| | 'anna' | {'link_to': ['bob']} | | 'bob' | {'link_to': ['anna', 'claudia']} | | 'claudia' | {'link_to': []} |
Now I want to do two things.
put the list from the json at ‘link_to’ into another column called referrals_to (which must be of array VARCHAR type then). From my example:
name notes referrals_to ‘anna’ {‘link_to’: [‘bob’]} [‘bob’] ‘bob’ {‘link_to’: [‘anna’, ‘claudia’]} [‘anna’, ‘claudia’] ‘claudia’ {‘link_to’: []} [] create another column called referrals_from, where I want to store all names from which a name was referred. In my example:
name notes referrals_from ‘anna’ {‘link_to’: [‘bob’]} [‘bob’] ‘bob’ {‘link_to’: [‘anna’, ‘claudia’]} [‘anna’] ‘claudia’ {‘link_to’: []} [‘bob’]
How do I do this using postgresql queries? I could easily do it using python, but this would be slower than using postgresql directly, I guess.
Advertisement
Answer
The referrals_to
column can be done using the ->
operator to extract the array. For the referrals_from
column I would use a scalar sub-select to collect all referring names into a single column (a JSON array)
select name, notes, notes -> 'link_to' as referrals_to, (select jsonb_agg(name) from the_table t2 where t2.name <> t1.name and t2.notes -> 'link_to' ? t1.name) as referrals_from from the_table t1 ;
The ?
operator tests if a JSON array contains a specific string. In this case it tests if the link_to
array contains the name from the outer query.
Assuming name
is unique in your table you can use that query to update the new columns in the table:
update the_table set referrals_to = notes -> 'link_to', referrals_from = t.referrals_from from ( select t1.name, (select jsonb_agg(name) from the_table t2 where t2.name <> t1.name and t2.notes -> 'link_to' ? t1.name) as referrals_from from the_table t1 ) t where t.name = the_table.name;