I have a table that has 3 main categories: age, city and gender.
The gender column is boolean.
The age category is in the following range:
['18-25','26-35','36-45','46-55','56-65','66-75','75+']
The city category has around 19 cities. So, for each city in category '18-25' and '75+' I would like to select 24 random rows for females and 20 for males.
For the rest of the cities, I would like to select 40 random males and 47 random females.
Is it possible to achieve this without having to run a specific query for each case?
I’m sorry the lack of code, but honestly I don’t even know where or how to start 🙁
Advertisement
Answer
You can use row_number() to enumerate the rows randomly and then just filter:
select t.*
from (select t.*,
row_number() over (partition by city, age, gender order by rand()) as seqnum
from t
) t
where (age in ('18-25', '75+') and
(gender = 'female' and seqnum <= 24 or
gender = 'male' and seqnum <= 20
)
) or
(age not in ('18-25', '75+') and
(gender = 'female' and seqnum <= 47 or
gender = 'male' and seqnum <= 40
)
)