This is similar to How to find missing data rows using SQL? and How to find missing rows (dates) in a mysql table? but a bit more complex, so I’m hitting a wall.
I have a data table with the noted Primary key:
country_id (PK) product_id (PK) history_date (PK) amount
I have a products table with all products, a countries table, and a calendar table with all valid dates.
I’d like to find all countries, dates and products for which there are missing products, with this wrinkle: I only care about dates for which there are entries for a country for at least one product (i.e. if the country has NOTHING on that day, I don’t need to find it) – so, by definition, there is an entry in the history table for every country and date I care about.
I know it’s going to involve some joins maybe a cross join, but I’m hitting a real wall in finding missing data.
I tried this (pretty sure it wouldn’t work):
SELECT h.history_date, h.product_id, h.country_id, h.amount FROM products p LEFT JOIN history h ON (p.product_id = h.product_id) WHERE h.product_id IS NULL
No Joy.
I tried this too:
WITH allData AS (SELECT h1.country_id, p.product_id, h1.history_date FROM products p CROSS JOIN (SELECT DISTINCT country_id, history_date FROM history) h1) SELECT f.history_date, f.product_id, f.country_id FROM allData f LEFT OUTER JOIN history h ON (f.country_id = h.country_id AND f.history_date = h.history_date AND f.product_id = h.product_id) WHERE h.product_id IS NULL AND h.country_id IS NOT NULL AND h.history_date IS NOT null
also no luck. The CTE does get me every product on every date that there is also data, but the rest returns nothing.
Advertisement
Answer
I only care about dates for which there are entries for a country for at least one product (i.e. if the country has NOTHING on that day, I don’t need to find it)
So we care about this combination:
from (select distinct country_id, history_date from history) country_date cross join products p
Then it’s just a matter of checking for existence:
select * from (select distinct country_id, history_date from history) country_date cross join products p where not exists (select null from history h where country_date.country_id = h.country_id and country_date.history_date = h.history_date and p.product_id = h.product_id )