As for MySQL, I know how to retrive the last record in each group, such as:
SELECT A.*
FROM (
SELECT *
FROM devops.clearquest_clearquest
ORDER BY 'created_at' DESC
) AS A
GROUP BY A.submitter;
But DB2 is a little bit different, you can only select column which is used after GROUP BY, like this:
SELECT A.submitter
FROM (
SELECT *
FROM devops.clearquest_clearquest
ORDER BY 'created_at' DESC
) AS A
GROUP BY A.submitter;
So, if I want to fetch two columns, I need to GROUP BY them like this:
SELECT A.column01, A.column02
FROM (
SELECT *
FROM devops.clearquest_clearquest
ORDER BY 'created_at' DESC
) AS A
GROUP BY A.column01, A.column02;
here comes the problem, I can’t retrieve the last record in each group if I GROUP BY two columns. I’m new to DB2, please help.
Advertisement
Answer
This is not proper SQL:
SELECT cc.*
FROM (SELECT *
FROM devops.clearquest_clearquest
ORDER BY created_at DESC
) cc
GROUP BY cc.submitter;
Note: No quotes around created_at
.
Because you have columns in the SELECT
that are not GROUP BY
keys and not the arguments to aggregation functions. This will return a syntax error in almost all databases. Happily, it will also break in the more recent versions of MySQL (with the default options).
Often the most efficient equivalent is:
select cc.*
from devops.clearquest_clearquest cc
where cc.created_at = (select max(cc2.created_at)
from devops.clearquest_clearquest cc2
where cc2.submitter = cc.submitter
);
With an index on devops.clearquest_clearquest(submitter, created_at)
, this is often the fastest approach in any database.
Other answers have provided the answer using row_number()
. That is also a very good solution, but often a wee bit slower, because the row number is assigned to all rows before most are filtered out.