Skip to content
Advertisement

Can SQL return different results for two runs of the same query using ORDER BY?

I have the following table:

CREATE TABLE dbo.TestSort
(
Id int NOT NULL IDENTITY (1, 1),
Value int NOT NULL
) 

The Value column could (and is expected to) contain duplicates.
Let’s also assume there are already 1000 rows in the table.

I am trying to prove a point about unstable sorting.

Given this query that returns a ‘page’ of 10 results from the first 1000 inserted results:

SELECT TOP 10 * FROM TestSort WHERE Id <= 1000 ORDER BY Value

My intuition tells me that two runs of this query could return different rows if the Value column contains repeated values.

I’m basing this on the facts that:

  • the sort is not stable
  • if new rows are inserted in the table between the two runs of the query, it could possibly create a re-balancing of B-trees (the Value column may be indexed or not)

EDIT: For completeness: I assume rows never change once inserted, and are never deleted.

In contrast, a query with stable sort (ordering also by Id) should always return the same results, since IDs are unique:

SELECT TOP 10 * FROM TestSort WHERE Id <= 1000 ORDER BY Value, Id

The question is: Is my intuition correct? If yes, can you provide an actual example of operations that would produce different results (at least “on your machine”)? You could modify the query, add indexes on the Values column etc.
I don’t care about the exact query, but about the principle.

I am using MS SQL Server (2014), but am equally satisfied with answers for any SQL database.

If not, then why?

Advertisement

Answer

If you have ties when ordering the order by is not stable.

LiveDemo

CREATE TABLE #TestSort
(
Id INT NOT NULL IDENTITY (1, 1) PRIMARY KEY,
Value INT NOT NULL
) ;

DECLARE @c INT = 0;

WHILE @c < 100000
BEGIN
  INSERT INTO #TestSort(Value)
  VALUES ('2');

  SET @c += 1;
END

Example:

SELECT TOP 10 * 
FROM #TestSort 
ORDER BY Value
OPTION (MAXDOP 4);

DBCC DROPCLEANBUFFERS;  -- run to clear cache

SELECT TOP 10 * 
FROM #TestSort 
ORDER BY Value
OPTION (MAXDOP 4);

The point is I force query optimizer to use parallel plan so there is no guaranteed that it will read data sequentially like Clustered index probably will do when no parallelism is involved.

You cannot be sure how Query Optimizer will read data unless you explicitly force to sort result in specific way using ORDER BY Id, Value.

For more info read No Seatbelt - Expecting Order without ORDER BY.

User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement