I have following tables:
Orders
order_id
9
10
11
Order_details
order_id, product_id
9, 7
10, 5
10, 6
11, 6
11, 7
Products
product_id, product_name, price
5, potato, 4.99
6, potato *, 7.5
7, orange, 7.99
I already received feedback how to find orders where product names are duplicated, but now situation is a little more complicated as it turned out duplicated positions have additional sign ‘ *’ after it’s name as showed above.
How to add to this query possibility to count ONLY orders where there is one product without additional characters and other with it?
for example ‘potato’ and ‘potato’ would be ignored, ‘potato *’ and ‘potato *’ would be also ignored but order with ‘potato’ and ‘potato *’ would be in the results
select od.order_id
from order_details od join
products p
on od.product_id = p.product_id
group by od.order_id
having count(p.product_name) > count(distinct p.product_name)
Advertisement
Answer
One option might be to just do a simple replacement to remove *
from the product name:
SELECT
od.order_id
FROM order_details od
INNER JOIN products p
ON od.product_id = p.product_id
GROUP BY
od.order_id
HAVING
COUNT(DISTINCT p.product_name) <>
COUNT(DISTINCT REPLACE(p.product_name, ' *', ''));
Demo
The demo is given for MySQL, but the same query should run on at least several other databases.
Ideally, it would be better to do a regex replacement on the product name, which would avoid the possibility that space followed by *
happens to appear elsewhere as a legitimate part of the product name.
Edit:
Since you are using Postgres we can in fact do a more targeted regex replacement:
SELECT
od.order_id
FROM order_details od
INNER JOIN products p
ON od.product_id = p.product_id
GROUP BY
od.order_id
HAVING
COUNT(DISTINCT p.product_name) <>
COUNT(DISTINCT REGEXP_REPLACE(p.product_name, ' *$', ''));