I have a table like as shown below
I would like to create two new binary columns indicating whether the subject had steroids and aspirin. I am looking to implement this in Postgresql and google bigquery
I tried the below but it doesn’t work
select subject_id
case when lower(drug) like ('%cortisol%','%cortisone%','%dexamethasone%') 
then 1 else 0 end as steroids,
case when lower(drug) like ('%peptide%','%paracetamol%') 
then 1 else 0 end as aspirin,
from db.Team01.Table_1
SELECT 
db.Team01.Table_1.drug
FROM `table_1`,
UNNEST(table_1.drug) drug
WHERE REGEXP_CONTAINS( db.Team01.Table_1.drug,r'%cortisol%','%cortisone%','%dexamethasone%')
I expect my output to be like as shown below
Advertisement
Answer
Below is for BigQuery Standard SQL
#standardSQL SELECT subject_id, SUM(CASE WHEN REGEXP_CONTAINS(LOWER(drug), r'cortisol|cortisone|dexamethasone') THEN 1 ELSE 0 END) AS steroids, SUM(CASE WHEN REGEXP_CONTAINS(LOWER(drug), r'peptide|paracetamol') THEN 1 ELSE 0 END) AS aspirin FROM `db.Team01.Table_1` GROUP BY subject_id
if to apply to sample data from your question – result is
Row subject_id steroids aspirin 1 1 3 1 2 2 1 1
Note: instead of simple LIKE ending with lengthy and redundant text – I am using LIKE on steroids – which is REGEXP_CONTAINS   

