Skip to content
Advertisement

How to pull rows from a SQL table until quotas for multiple columns are met?

I’ve been able to find a few examples of questions similar to this one, but most only involve a single column being checked.

SQL Select until Quantity Met

Select rows until condition met

I have a large table representing facilities, with columns for each type of resource available and the number of those specific resources available per facility. I want this stored procedure to be able to take integer values in as multiple parameters (representing each of these columns) and a Lat/Lon. Then it should iterate over the table sorted by distance, and return all rows (facilities) until the required quantity of available resources (specified by the parameters) are met.

Data source example:

Id Lat Long Resource1 Resource2
1 50.123 4.23 5 12
2 61.234 5.34 0 9
3 50.634 4.67 21 18

Result Wanted:

@latQuery = 50.634

@LongQuery = 4.67

@res1Query = 10

@res2Query = 20

Id Lat Long Resource1 Resource2
3 50.634 4.67 21 18
1 50.123 4.23 5 12

Result includes all rows that meet the queries individually. Result is also sorted by distance to the requested lat/lon


I’m able to sort the results by distance, and sum the total running values as suggested in other threads, but I’m having some trouble with the logic comparing the running values with the quota provided in the params.

First I have some CTEs to get most recent edits, order by distance and then sum the running totals

WITH cte1 AS (SELECT
  @origin.STDistance(geography::Point(Facility.Lat, Facility.Long, 4326)) AS distance,
  Facility.Resource1 as res1,
  Facility.Resource2 as res2
  -- ...etc
  FROM Facility
),
cte2 AS (SELECT 
  distance, 
  res1, 
  SUM(res1) OVER (ORDER BY distance) AS totRes1,
  res2, 
  SUM(res1) OVER (ORDER BY distance) AS totRes2
  -- ...etc, there's 15-20 columns here
  FROM cte1
)

Next, with the results of that CTE, I need to pull rows until all quotas are met. Having the issues here, where it works for one row but my logic with all the ANDs isn’t exactly right.

SELECT * FROM cte2 WHERE (
  (totRes1 <= @res1Query OR (totRes1 > @res1Query AND totRes1- res1 <= @totRes1)) AND
  (totRes2 <= @res2Query OR (totRes2 > @res2Query AND totRes2- res2 <= @totRes2)) AND
  -- ... I also feel like this method of pulling the next row once it's over may be convoluted as well?
)

As-is right now, it’s mostly returning nothing, and I’m guessing it’s because it’s too strict? Essentially, I want to be able to let the total values go past the required values until they are all past the required values, and then return that list.

Has anyone come across a better method of searching using separate quotas for multiple columns?


See my update in the answers/comments

Advertisement

Answer

I think you are massively over-complicating this. This does not need any joins, just some running sum calculations, and the right OR logic.

The key to solving this is that you need all rows, where the running sum up to the previous row is less than the requirement for all requirements. This means that you include all rows where the requirement has not been met, and the first row for which the requirement has been met or exceeded.

To do this you can subtract the current row’s value from the running sum.

You could utilize a ROWS specification of ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING. But then you need to deal with NULL on the first row.

In any event, even a regular running sum should always use ROWS UNBOUNDED PRECEDING, because the default is RANGE UNBOUNDED PRECEDING, which is subtly different and can cause incorrect results, as well as being slower.

You can also factor out the distance calculation into a CROSS APPLY (VALUES, avoiding the need for lots of CTEs or derived tables. You now only need one level of derivation.

DECLARE @origin geography = geography::Point(@latQuery, @LongQuery, 4326);

SELECT
  f.Id,
  f.Lat,
  f.Long,
  f.Resource1,
  f.Resource2
FROM (
    SELECT f.*,
        SumRes1 = SUM(f.Resource1) OVER (ORDER BY v1.Distance ROWS UNBOUNDED PRECEDING) - f.Resource1,
        SumRes2 = SUM(f.Resource2) OVER (ORDER BY v1.Distance ROWS UNBOUNDED PRECEDING) - f.Resource2
    FROM Facility f
    CROSS APPLY (VALUES(
        @origin.STDistance(geography::Point(f.Lat, f.Long, 4326))
    )) v1(Distance)
) f
WHERE (
      f.SumRes1 < @res1Query
   OR f.SumRes2 < @res2Query
);

db<>fiddle

User contributions licensed under: CC BY-SA
9 People found this is helpful
Advertisement