I’m trying to construct a DB2 query that returns distinct rows, but for the distinction to be based on only one column’s values.
The query below will allow multiple rows with the same order number if, say, itemColor
is different.
select distinct orderNumber, itemColor, orderDate from ORDER_TABLE
returns…
order_number item_color order_date 0001 | red | 2018-06-19 0001 | green | 2018-06-19 // DUPLICATE ORDER_NUMBER 0002 | red | 2018-06-19 0003 | red | 2018-06-19 0004 | red | 2018-06-19
What I’d like is for my query to do something like this:
select orderNumber, itemColor, orderDate from ORDER_TABLE where orderNumber is distinct
That way I can be assured that only one record will be returned for every order number. It seems that DB2 does not allow for distinct column values, but only distinct full rows.
A suggestion I saw said to do a sub query, like so:
select orderNumber, itemColor, orderDate from ORDER_TABLE where orderNumber in (select distinct(orderNumber) from ORDER_TABLE)
… but sadly this didn’t work, I still saw records with the same orderNumber
in my result set.
I see the official docs say to use the group by
operator – but this seems to be for aggregating functions, and I don’t want to aggregate any records! I just want ONE row per orderNumber
. Maybe I’m misunderstanding the docs.
Thanks.
Advertisement
Answer
You can use window functions:
select orderNumber, itemColor, orderDate from (select o.*, row_number() over (partition by orderNumber order by orderNumber) as seqnum from ORDER_TABLE ) o where seqnum = 1;