I have a table (Postgres 9.3) defined as follows:
CREATE TABLE tsrs (
    id                   SERIAL PRIMARY KEY,
    customer_id          INTEGER NOT NULL REFERENCES customers,
    timestamp            TIMESTAMP WITHOUT TIME ZONE,
    licensekeys_checksum VARCHAR(32)
);
The pertinent details here are the customer_id, the timestamp, and the licensekeys_checksum. There can be multiple entries with the same customer_id; some of those may have matching licensekeys_checksum entries, and some may be different. There will never be rows with equal checksum and equal timestamps.
I want to return a table containing one row for each group of rows with matching licensekeys_checksum entries. The row returned for each group should be the one with the most recent timestamp.
Sample Input:
1, 2, 2014-08-21 16:03:35, 3FF2561A
2, 2, 2014-08-22 10:00:41, 3FF2561A
2, 2, 2014-06-10 10:00:41, 081AB3CA
3, 5, 2014-02-01 12:03:23, 299AFF90
4, 5, 2013-12-13 08:14:26, 299AFF90
5, 6, 2013-09-09 18:21:53, 49FFA891
Desired Output:
2, 2, 2014-08-22 10:00:41, 3FF2561A
2, 2, 2014-06-10 10:00:41, 081AB3CA
3, 5, 2014-02-01 12:03:23, 299AFF90
5, 6, 2013-09-09 18:21:53, 49FFA891
I have managed to piece together a query based on the comments below, and hours of searching on the internet. 🙂
select *
from tsrs
inner join (
    select licensekeys_checksum, max(timestamp) as mts
    from tsrs
    group by licensekeys_checksum
) x on x.licensekeys_checksum = tsrs.licensekeys_checksum
   and x.mts = tsrs.timestamp;
It seems to work, but I am unsure. Am I on the right track?
Answer
Your query in the question should perform better than the queries in the (previously) accepted answer. Test with EXPLAIN ANALYZE.
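As a rough sketch of such a test, the join-based query from the question can be prefixed with EXPLAIN ANALYZE. The alias t is only there for readability and is not part of the original query. Keep in mind that EXPLAIN ANALYZE actually executes the statement, so the reported times are real run times on your data, and comparing candidates is only meaningful on a realistically sized table.

```sql
-- Time the join-based query from the question.
-- EXPLAIN ANALYZE runs the query and reports the actual plan and timings.
EXPLAIN ANALYZE
SELECT t.*
FROM   tsrs t
JOIN  (
   SELECT licensekeys_checksum, max(timestamp) AS mts
   FROM   tsrs
   GROUP  BY licensekeys_checksum
   ) x ON  x.licensekeys_checksum = t.licensekeys_checksum
       AND x.mts = t.timestamp;
```

Running the same prefix on any alternative query gives a second plan to compare against.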
DISTINCT ON is typically simpler and faster:
SELECT DISTINCT ON (licensekeys_checksum) *
FROM   tsrs
ORDER  BY licensekeys_checksum, timestamp DESC NULLS LAST;
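To try this out without touching the real table, here is a minimal sketch that loads the sample rows from the question into a scratch table. The table name tsrs_demo is hypothetical, the foreign key to customers is left out so the script is self-contained, and the id values are assigned by the sequence rather than copied from the sample. Since the timestamp column is nullable and PostgreSQL sorts NULL first in descending order, NULLS LAST keeps a row without a timestamp from being picked ahead of rows that have one.

```sql
-- Scratch copy of the table (hypothetical name; FK to customers omitted).
CREATE TEMP TABLE tsrs_demo (
   id                   serial PRIMARY KEY,
   customer_id          integer NOT NULL,
   "timestamp"          timestamp without time zone,
   licensekeys_checksum varchar(32)
);

-- Sample rows from the question; ids come from the sequence.
INSERT INTO tsrs_demo (customer_id, "timestamp", licensekeys_checksum) VALUES
   (2, '2014-08-21 16:03:35', '3FF2561A'),
   (2, '2014-08-22 10:00:41', '3FF2561A'),
   (2, '2014-06-10 10:00:41', '081AB3CA'),
   (5, '2014-02-01 12:03:23', '299AFF90'),
   (5, '2013-12-13 08:14:26', '299AFF90'),
   (6, '2013-09-09 18:21:53', '49FFA891');

-- Keep the first row of each checksum group after sorting newest first.
SELECT DISTINCT ON (licensekeys_checksum) *
FROM   tsrs_demo
ORDER  BY licensekeys_checksum, "timestamp" DESC NULLS LAST;
```

The leading ORDER BY expressions have to match the DISTINCT ON expressions; the remaining sort keys decide which row of each group is kept.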
Detailed explanation: