I’m struggling on how can I fix, for example, 5 random rows of my table in SQL. But I need to be able to repeatable retrieve the same 5 rows every time I run the code with a given seed.
I tried to use
SELECT TOP 5 * FROM mytable ORDER BY NEWID()
But I didn’t get the same sample each time that I run the code.
Any idea on how can I do that?
Advertisement
Answer
The RAND function can have a seed as parameter.
But the resulting “random” float is predictable.
F.e.
select ID from table1 order by rand(ID)
would give the same order as
select ID from table1 order by ID
So the seeds need to be spiced up.
For example by a calculating a hash.
declare @seed int = 42; select top 5 * from your_table order by rand(HashBytes('MD5', concat(col1, @seed))); GOid | col1 --: | ---: 164 | 64 200 | 100 188 | 88 150 | 50 106 | 6
declare @seed int = 42; select top 5 * from your_table order by rand(HashBytes('MD5', concat(col1, @seed))); GOid | col1 --: | ---: 164 | 64 200 | 100 188 | 88 150 | 50 106 | 6
declare @seed int = 43; select top 5 * from your_table order by rand(HashBytes('MD5', concat(col1, @seed))); GOid | col1 --: | ---: 111 | 11 112 | 12 117 | 17 180 | 80 150 | 50
db<>fiddle here