I have a table look like this:
x
ID B C date01 date02 date03
11 xxx xxxx 2020-05-01 2020-05-02 2020-06-02
11 xxx xxxx 2020-06-01 2020-06-03 2020-05-02
11 xxx xxxx 2020-07-01 2020-07-03 2020-06-30
11 xxx xxxx 2020-07-01 2020-06-03 2020-06-30
11 xxx xxxx 2020-01-01 2020-01-08 2020-05-02
11 xxx xxxx 2020-02-01 2020-01-31 2020-05-02
22 xxx xxxx 2020-05-01 2020-05-02 2020-06-02
22 xxx xxxx 2020-06-01 2020-06-03 2020-05-02
22 xxx xxxx 2020-07-01 2020-07-03 2020-06-30
22 xxx xxxx 2020-07-01 2020-06-03 2020-06-30
22 xxx xxxx 2020-01-01 2020-01-08 2020-05-02
22 xxx xxxx 2020-02-01 2020-01-31 2020-05-02
I want to return everything but with a latest date of those three dates for each ID, and date02 cannot be later than date03, my current output will give me this where date02 > date03:
11 xxx xxxx 2020-07-01 2020-07-03 2020-06-30
Expected output:
11 xxx xxxx 2020-07-01 2020-06-03 2020-06-30
22 xxx xxxx 2020-07-01 2020-06-03 2020-06-30
I tried this:
SELECT
id,
B,
C,
max(date01),
max(date02),
max(date03),
FROM
table
WHERE
'date02' < 'date03'
GROUP BY id
I’ve added WHERE 'date02' < 'date03'
but why the output still have the records where date02>date03?? I’m very new to SQL, please help…
Advertisement
Answer
You can do this with a correlated subquery and tuple equality:
select t.*
from mytable t
where (t.date01, t.date02, t.date03) = (
select t1.date01, t1.date02, t1.date03
from mytable t1
where t1.id = t.id
order by t1.date01 desc, t1.date02 desc, t1.date03 desc
limit 1
)
For performance with this query, you can create a compound index on (id, date01, date02, date03)
.
You can also use row_number()
, if you are running MySQL 8.0:
select *
from (
select
t.*,
row_number() over(partition by id order by date01 desc, date02 desc, date03 desc) rn
from mytable t
) t
where rn = 1
ID | B | C | date01 | date02 | date03 -: | :-- | :--- | :--------- | :--------- | :--------- 11 | xxx | xxxx | 2020-07-01 | 2020-07-03 | 2020-06-30 22 | xxx | xxxx | 2020-07-01 | 2020-07-03 | 2020-06-30