The part of the schema is as follows:
- Customer (cid, name, city) (primary key: cid)
- Purchase (cid, club, title, year, when, qnty) (primary key: cid, club, title, year, when)
I want to list distinct pairs of customers who have made a purchase of a book (title) in common. For each pair of such customers, show first the one with the larger cid. In the answer table, only report the customers’ names. The output columns’ names are namea
and nameb
.
I already list all pairs with different namea and nameb with the same title
SELECT distinct c1.name as namea, c2.name as nameb FROM purchase p1 JOIN customer AS c1 ON c1.cid = p1.cid JOIN purchase p2 ON p1.title = p2.title and p1.year = p2.year and p1.cid != P2.cid and p1.cid < p2.cid JOIN customer AS c2 ON c2.cid = p2.cid;
In case someone needs the table creation file: create, the expected return number should be 283, but I keep getting 282 since there are 2 people with the same name.
Edited:
There will be two people with the same name, so the output pairs need to be distinct by the cid
, not the name
. The number of my actual output tuples matches the expected output when I SELECT cid
, but it doesn’t work when I SELECT names. Since customers are identified by the cid
, not the name
, some customers with the same name will be filtered out if we distinct names. And this is the problem I’m working on right now.
Advertisement
Answer
Add the cid
s to the select list of your query and then select from it only the names:
SELECT t.namea, t.nameb FROM ( SELECT DISTINCT c1.cid ida, c1.name as namea, c2.cid idb, c2.name as nameb FROM purchase p1 JOIN customer AS c1 ON c1.cid = p1.cid JOIN purchase p2 ON p1.title = p2.title and p1.year = p2.year and p1.cid != p2.cid and p1.cid < p2.cid JOIN customer AS c2 ON c2.cid = p2.cid ) t
Or you can group by c1.cid, c1.name, c2.cid, c2.name
:
SELECT c1.name as namea, c2.name as nameb FROM purchase p1 JOIN customer AS c1 ON c1.cid = p1.cid JOIN purchase p2 ON p1.title = p2.title and p1.year = p2.year and p1.cid != p2.cid and p1.cid < p2.cid JOIN customer AS c2 ON c2.cid = p2.cid GROUP BY c1.cid, c1.name, c2.cid, c2.name
See the demo.
Result 283 rows.