Are Postgres SELECT DISTINCT
queries deterministic?
Will SELECT DISTINCT somecolumn FROM sometable
return the same result (including order) if the table (and entire database) goes unchanged?
In the Select Query Documentation the Description section notes:
If the ORDER BY clause is specified, the returned rows are sorted in the specified order. If ORDER BY is not given, the rows are returned in whatever order the system finds fastest to produce.
In the DISTINCT ON
clause section they add:
Note that the “first row” of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first.
Generally, is this still true when the database goes un-changed?
Advertisement
Answer
This answer assumes that the expressions in the select
are deterministic. Otherwise, the question seems trivial.
The ordering is not specified, so it could change between runs of the query — or on a different system. However, the result set should be the same.
Your second quote from the documentation is for distinct on
. That is not-deterministic, unless you are using a stable sort.
Note: You might get non-deterministic results if you are using a case-insensitive collation. The built-in collations are case-sensitive; and case insensitivity means that the original expressions are not deterministic.