I’m trying to find a performant and easy-to-read query that returns the distinct values of one column for which every row with that value matches a certain criterion.
I have a table that tracks e-commerce orders and whether they were delivered on time, with the following contents and schema:
> select * from orders;
+----+-------------------+-------------+
| id | delivered_on_time | customer_id |
+----+-------------------+-------------+
|  1 |                 1 |           9 |
|  2 |                 0 |           9 |
|  3 |                 1 |          10 |
|  4 |                 1 |          10 |
|  5 |                 0 |          11 |
+----+-------------------+-------------+
I would like to get all distinct customer_id values for which every order was delivered on time. That is, I would like output like this:
+-------------+
| customer_id |
+-------------+
|          10 |
+-------------+
What’s the best way to do this?
I’ve found a solution, but it’s a bit hard to read and I doubt it’s the most efficient way to do it (it uses two CTEs):
> with hits_all as (
    select memberid, count(*) as count from orders group by memberid
  ),
  hits_true as (
    select memberid, count(*) as count from orders where hit = true group by memberid
  )
  select *
  from hits_true
  inner join hits_all
    on hits_all.memberid = hits_true.memberid
   and hits_all.count = hits_true.count;
+----------+-------+----------+-------+
| memberid | count | memberid | count |
+----------+-------+----------+-------+
|       10 |     2 |       10 |     2 |
+----------+-------+----------+-------+
Answer
You can use group by and having as follows:
select customer_id
from orders
group by customer_id
having sum(delivered_on_time) = count(*)
This works because an on-time delivery is identified by delivered_on_time = 1. So you can just check that the sum of delivered_on_time is equal to the number of records for the customer. Note that this relies on delivered_on_time only ever holding 0 or 1, with no NULLs.
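If you want to sanity-check the query against the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory SQLite database and the integer column types are assumptions made for illustration, since the question does not say which database engine or column types are in use.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real orders table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table orders ("
    "id integer primary key, "
    "delivered_on_time integer, "
    "customer_id integer)"
)

# Sample rows copied from the question: (id, delivered_on_time, customer_id).
conn.executemany(
    "insert into orders (id, delivered_on_time, customer_id) values (?, ?, ?)",
    [(1, 1, 9), (2, 0, 9), (3, 1, 10), (4, 1, 10), (5, 0, 11)],
)

# Keep only customers whose number of on-time orders equals their total orders.
query = """
select customer_id
from orders
group by customer_id
having sum(delivered_on_time) = count(*)
"""
customers = [row[0] for row in conn.execute(query).fetchall()]
print(customers)  # [10]
```

Customer 9 has a sum of 1 against a count of 2, and customer 11 has a sum of 0 against a count of 1, so both are filtered out and only customer 10 remains.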