I have following tables:
Orders order_id 9 10 11 Order_details order_id, product_id 9, 7 10, 5 10, 6 11, 6 11, 7 Products product_id, product_name, price 5, potato, 4.99 6, potato *, 7.5 7, orange, 7.99
I already received feedback how to find orders where product names are duplicated, but now situation is a little more complicated as it turned out duplicated positions have additional sign ‘ *’ after it’s name as showed above.
How to add to this query possibility to count ONLY orders where there is one product without additional characters and other with it?
for example ‘potato’ and ‘potato’ would be ignored, ‘potato *’ and ‘potato *’ would be also ignored but order with ‘potato’ and ‘potato *’ would be in the results
select od.order_id from order_details od join products p on od.product_id = p.product_id group by od.order_id having count(p.product_name) > count(distinct p.product_name)
Advertisement
Answer
One option might be to just do a simple replacement to remove *
from the product name:
SELECT od.order_id FROM order_details od INNER JOIN products p ON od.product_id = p.product_id GROUP BY od.order_id HAVING COUNT(DISTINCT p.product_name) <> COUNT(DISTINCT REPLACE(p.product_name, ' *', ''));
The demo is given for MySQL, but the same query should run on at least several other databases.
Ideally, it would be better to do a regex replacement on the product name, which would avoid the possibility that space followed by *
happens to appear elsewhere as a legitimate part of the product name.
Edit:
Since you are using Postgres we can in fact do a more targeted regex replacement:
SELECT od.order_id FROM order_details od INNER JOIN products p ON od.product_id = p.product_id GROUP BY od.order_id HAVING COUNT(DISTINCT p.product_name) <> COUNT(DISTINCT REGEXP_REPLACE(p.product_name, ' *$', ''));