After using the following code in my database, I got the following results:
select USERID, array_agg(struct(ORDER_TIME, DELIVERY_TIME, PLATFORM) order by ORDER_TIME) STATS from `project.dataset.table` group by USERID having count(1) > 1 order by USERID
USERID | STATS.ORDER_TIME | STATS.DELIVERY_TIME | STATS.PLATFORM |
---|---|---|---|
011 | 2021-01-09 11:14:18 | 2021-01-09 11:44:01 | mobile |
2021-02-12 16:15:51 | 2021-02-12 17:16:51 | desktop | |
2021-03-30 17:23:45 | 2021-02-12 17:16:51 | desktop | |
033 | 2021-01-01 12:30:14 | 2021-01-01 13:30:00 | mobile |
2021-04-16 23:00:45 | 2021-04-16 23:45:40 | mobile | |
040 | 2021-02-18 19:22:55 | 2021-02-18 20:00:05 | mobile |
2021-05-06 09:12:13 | 2021-05-06 10:00:10 | desktop |
However, I only need those registers which contains both mobile AND desktop. So I need something like this, a result where there is no data for the USERID 040, because they only ordered on a mobile phone:
USERID | STATS.ORDER_TIME | STATS.DELIVERY_TIME | STATS.PLATFORM |
---|---|---|---|
011 | 2021-01-09 11:14:18 | 2021-01-09 11:44:01 | mobile |
2021-02-12 16:15:51 | 2021-02-12 17:16:51 | desktop | |
2021-03-30 17:23:45 | 2021-02-12 17:16:51 | desktop | |
033 | 2021-01-01 12:30:14 | 2021-01-01 13:30:00 | mobile |
2021-05-06 09:12:13 | 2021-05-06 10:00:10 | desktop |
How can I possibly do that? Thank you very much!
Advertisement
Answer
The “simplest” way is to add few more conditions into having
clause
select USERID, array_agg(struct(ORDER_TIME, DELIVERY_TIME, PLATFORM) order by ORDER_TIME) STATS from `project.dataset.table` group by USERID having count(1) > 1 and 'mobile' in unnest(array_agg(PLATFORM)) and 'desktop' in unnest(array_agg(PLATFORM)) order by USERID
if applied to sample data in your question – output is
In cases when you have more such entries to compare with – you can use below version to avoid repeating similar line of code
select USERID, array_agg(struct(ORDER_TIME, DELIVERY_TIME, PLATFORM) order by ORDER_TIME) STATS from `project.dataset.table` group by USERID having count(1) > 1 and array_length(array_agg(distinct if(PLATFORM in ('mobile', 'desktop'), PLATFORM, null))) = 2 order by USERID