Skip to content
Advertisement

MySQL query to get “intersection” of numerous queries with limits

Assume I have a single mySQL table (users) with the following fields:

I want to be able to return the number of total records based on the number a user enters. Furthermore, they will also be providing additional criteria.

In the simplest example, they may ask for 1,000 records, where 600 records should have gender = ‘Male’ and 400 records where gender = ‘Female’. That’s simple enough to do.

Now, go one step further. Assume they now want to specify Region:

Again, only 1000 records should be returned, but in the end, there must be 600 males, 400 females, 100 Northerners, 200 Southerners, 300 Easterners and 400 Westerners.

I know this isn’t valid syntax, but using pseudo-mySQL code, it hopefully illustrates what I’m trying to do:

Note that I’m not looking for a one-time query. The total number of records and the number of records within each criteria will constantly be changing based on input by the user. So, I’m trying to come up with a generic solution that can be re-used over and over, not a hard-coded solution.

To make things more complicated, now add more criteria. There could also be age, ethnicity and income each with their own set number of records for each group, additional code appended to above:

I’m not sure if this is possible to write in one query or if this requires multiple statements and iterations.

Advertisement

Answer

Flatten Your Criteria


You can flatten your multi-dimensional criteria into a single level criteria

enter image description here

Now this criteria can be achieved in one query as follow

Problem

  • It does not always return the correct result. For example, if there are less than 40 users whose are male and from north, then the query will return less than 1,000 records.

Adjust Your Criteria


Let say that there is less than 40 users whose are male and from north. Then, you need to adjust other criteria quantity to cover the missing quantity from “Male” and “North”. I believe it is not possible to do it with bare SQL. This is pseudo code that I have in mind. For sake of simplification, I think we will only query for Male, Female, North, and South

Let say that there are only 30 northener male. So we need to adjust +10 male, and +10 northener.

‘Male’ + ‘South’ is the first condition that match the ‘Male’ adjustment condition. Increase it by +10, and remove it from the “remain condition” list. Since, we increase the South, we need to decrease it back at other condition. So add “South” condition into “To Adjust” list

Find condition that match the ‘South’ and repeat the same process.

And finally

I haven’t come up with the exact implementation of adjustment yet. It is more difficult than I have expected. I will update once I can figure out how to implement it.

User contributions licensed under: CC BY-SA
9 People found this is helpful
Advertisement