As for MySQL, I know how to retrive the last record in each group, such as:
SELECT A.* FROM ( SELECT * FROM devops.clearquest_clearquest ORDER BY 'created_at' DESC ) AS A GROUP BY A.submitter;
But DB2 is a little bit different, you can only select column which is used after GROUP BY, like this:
SELECT A.submitter FROM ( SELECT * FROM devops.clearquest_clearquest ORDER BY 'created_at' DESC ) AS A GROUP BY A.submitter;
So, if I want to fetch two columns, I need to GROUP BY them like this:
SELECT A.column01, A.column02 FROM ( SELECT * FROM devops.clearquest_clearquest ORDER BY 'created_at' DESC ) AS A GROUP BY A.column01, A.column02;
here comes the problem, I can’t retrieve the last record in each group if I GROUP BY two columns. I’m new to DB2, please help.
Advertisement
Answer
This is not proper SQL:
SELECT cc.* FROM (SELECT * FROM devops.clearquest_clearquest ORDER BY created_at DESC ) cc GROUP BY cc.submitter;
Note: No quotes around created_at
.
Because you have columns in the SELECT
that are not GROUP BY
keys and not the arguments to aggregation functions. This will return a syntax error in almost all databases. Happily, it will also break in the more recent versions of MySQL (with the default options).
Often the most efficient equivalent is:
select cc.* from devops.clearquest_clearquest cc where cc.created_at = (select max(cc2.created_at) from devops.clearquest_clearquest cc2 where cc2.submitter = cc.submitter );
With an index on devops.clearquest_clearquest(submitter, created_at)
, this is often the fastest approach in any database.
Other answers have provided the answer using row_number()
. That is also a very good solution, but often a wee bit slower, because the row number is assigned to all rows before most are filtered out.