Skip to content
Advertisement

DISTINCT and ORDER BY in the same command without using the order by variable

I need to select the recent 5 games the user has gone into but I don’t want there to be duplicates. So I need a SQL line that will take them in order from most recent to least but also make sure there are no distinct. To order them I have a time variable, but if I use:

select distinct name, image, time 
from history 
where userID = USERID 
order by time desc

this removes all the other times apart from the first ones meaning that I will get the result in the order of which game was played first ever.

If you still don’t understand here is the schema and some of the data I have:

Schema:

CREATE TABLE history (userID integer, name text not null, image text not null, time text not null);

Data:

6 | Arcade Mathematicians | arcadeMathematicians | 2021-04-17 07:59:00  
6 | Arcade Mathematicians | arcadeMathematicians | 2021-04-17 08:01:04  
6 | Infinite Spinner | infiniteSpinner | 2021-04-17 08:01:08  
6 | Arcade Mathematicians | arcadeMathematicians | 2021-04-17 08:01:12  
6 | Arcade Mathematicians | arcadeMathematicians | 2021-04-17 08:02:13  
6 | Infinite Spinner | infiniteSpinner | 2021-04-17 08:02:16  
6 | Arcade Mathematicians | arcadeMathematicians | 2021-04-17 08:03:16  
6 | Infinite Spinner | infiniteSpinner | 2021-04-17 08:03:18  
6 | Arcade Mathematicians | arcadeMathematicians | 2021-04-17 08:03:21 

So if you see here if I use distinct on time it won’t work, as everything would still be displayed.

Advertisement

Answer

Use ROW_NUMBER() window function to rank each row of the table based on user and game and ordered by time descending.
From these results keep only the rows with rank = 1, meaning the latest rows for each combination of user and game.
Then use ROW_NUMBER() again to rank the remaining rows based on user and ordered by time descending and from the results keep only the rows with rank <= 5:

WITH 
  cte1 AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY userid, name ORDER BY time DESC) rn1
    FROM history
  ),
  cte2 AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY userid ORDER BY time DESC) rn2
    FROM cte1
    WHERE rn1 = 1
  )
SELECT userid, name, image, time 
FROM cte2
WHERE rn2 <= 5 

If you want results for a specific user it is easier, just use MAX() window function and sort the results to get the top 5 rows:

SELECT DISTINCT userid, name, image, 
       MAX(time) OVER (PARTITION BY name) time 
FROM history
WHERE userid = 6
ORDER BY time DESC LIMIT 5

Or if the image may be different for the same name form row to row:

SELECT DISTINCT userid, name, 
       FIRST_VALUE(image) OVER (PARTITION BY name ORDER BY time DESC) image, 
       MAX(time) OVER (PARTITION BY name) time 
FROM history
WHERE userid = 6
ORDER BY time DESC LIMIT 5

See the demo.
Results (for your sample data):

userid name image time
6 Arcade Mathematicians arcadeMathematicians 2021-04-17 08:03:21
6 Infinite Spinner infiniteSpinner 2021-04-17 08:03:18
User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement