
Most performant way to filter on multiple values in multiple columns?

I have an application where the user can retrieve a list. The user is allowed to add certain filters. For example:

  • Articles: 123, 456, 789
  • CustomerGroups: 1, 2, 3, 4, 5
  • Customers: null
  • ArticleGroups: null

When a filter is empty (or null), the query must ignore that filter.

What is the most performant way to build the query so that it can handle a lot (10+) of different filters and joins?


My current approach is the following, but it doesn’t scale very well:

CREATE PROCEDURE [dbo].[GetFilteredList] 
    @start datetime,
    @stop datetime,
    @ArticleList varchar(max), -- '123,456,789'
    @ArticleGroupList varchar(max),
    @CustomerList varchar(max),
    @CustomerGroupList varchar(max) -- '1,2,3,4,5'
    --More filters here...
AS 
    BEGIN
    SET NOCOUNT ON

    DECLARE @Articles TABLE (value VARCHAR(10)); 
    INSERT INTO @Articles (value) 
    SELECT * 
    FROM   [dko_db].[dbo].fnSplitString(@ArticleList, ','); 

    DECLARE @ArticleGroups TABLE (value VARCHAR(10)); 
    INSERT INTO @ArticleGroups (value) 
    SELECT * 
    FROM   [dko_db].[dbo].fnSplitString(@ArticleGroupList, ','); 

    DECLARE @Customers TABLE (value VARCHAR(10)); 
    INSERT INTO @Customers (value) 
    SELECT * 
    FROM   [dko_db].[dbo].fnSplitString(@CustomerList, ','); 

    DECLARE @CustomerGroups TABLE (value VARCHAR(10)); 
    INSERT INTO @CustomerGroups (value) 
    SELECT * 
    FROM   [dko_db].[dbo].fnSplitString(@CustomerGroupList, ',');

    select * -- Some columns here
    FROM [dbo].[Orders] o
    LEFT OUTER JOIN [dbo].[Article] a on o.ArticleId = a.Id
    LEFT OUTER JOIN [dbo].[ArticleGroup] ag on a.GroupId = ag.Id
    LEFT OUTER JOIN [dbo].[Customer] c on o.CustomerId = c.Id
    LEFT OUTER JOIN [dbo].[CustomerGroup] cg on c.GroupId = cg.Id
    -- More joins here
    WHERE o.OrderDate between @start and @stop and
        (isnull(@ArticleList, '') = '' or       a.ArticleCode   in (select value from @Articles)) and
        (isnull(@ArticleGroupList, '') = '' or  ag.GroupCode    in (select value from @ArticleGroups)) and
        (isnull(@CustomerList, '') = '' or      c.CustomerCode  in (select value from @Customers)) and
        (isnull(@CustomerGroupList, '') = '' or cg.GroupCode    in (select value from @CustomerGroups))
    ORDER BY c.Name, o.OrderDate

END 


Answer

There’s a lot of “low hanging fruit” performance improvements here.

First, lose ORDER BY c.Name, o.OrderDate; that's just needless sorting.

Second, for your “list” variables (e.g. @ArticleList): if you don’t need VARCHAR(MAX), change the data type(s) to VARCHAR(8000). VARCHAR(MAX) is much slower than VARCHAR(8000). I never use MAX data types unless I am certain they’re required.
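
Applied to this procedure, that change is just the parameter declarations (assuming none of the CSV lists can exceed 8000 characters):

CREATE PROCEDURE [dbo].[GetFilteredList]
    @start datetime,
    @stop datetime,
    @ArticleList varchar(8000),       -- was varchar(max)
    @ArticleGroupList varchar(8000),
    @CustomerList varchar(8000),
    @CustomerGroupList varchar(8000)
    --More filters here...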

Third, you can skip dumping your split values into table variables; that's just needless overhead. You can lose all those declarations and inserts, then change THIS:

... a.ArticleCode   in (select value from @Articles))

TO:

... a.ArticleCode   in (SELECT value FROM   dbo.fnSplitString(@ArticleList, ',')))
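
Applied to all four filters, the WHERE clause becomes (same logic as before, just without the table variables):

WHERE o.OrderDate between @start and @stop and
    (isnull(@ArticleList, '') = '' or       a.ArticleCode   in (SELECT value FROM dbo.fnSplitString(@ArticleList, ','))) and
    (isnull(@ArticleGroupList, '') = '' or  ag.GroupCode    in (SELECT value FROM dbo.fnSplitString(@ArticleGroupList, ','))) and
    (isnull(@CustomerList, '') = '' or      c.CustomerCode  in (SELECT value FROM dbo.fnSplitString(@CustomerList, ','))) and
    (isnull(@CustomerGroupList, '') = '' or cg.GroupCode    in (SELECT value FROM dbo.fnSplitString(@CustomerGroupList, ',')))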

Fourth, if fnSplitString is not an inline table-valued function (i.e. you see BEGIN and END in its DDL), it will be slow. An inline splitter will be much faster; consider DelimitedSplit8K or DelimitedSplit8K_LEAD.
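
If you're on SQL Server 2016 or later, the built-in STRING_SPLIT is another option. As a sketch, wrapping it in an inline table-valued function (the name fnSplitString_Inline here is just illustrative) would let you swap it in without touching the calling code; note that STRING_SPLIT returns a column named value and does not guarantee output order:

CREATE FUNCTION dbo.fnSplitString_Inline (@list varchar(8000), @delim char(1))
RETURNS TABLE
AS
RETURN
    -- Inline TVF: the optimizer expands this into the calling query,
    -- unlike a multi-statement TVF with BEGIN/END and a table variable.
    SELECT value
    FROM STRING_SPLIT(@list, @delim);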

Last, I would add OPTION (RECOMPILE), as this query is highly unlikely to benefit from plan caching. A recompile lets the optimizer compile the plan with the actual parameter values, so it can treat them as literals and prune away the catch-all filter branches that don't apply.
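
For this procedure, that just means appending the hint to the end of the SELECT statement:

SELECT ... -- Some columns here
FROM [dbo].[Orders] o
-- joins and WHERE clause as above
OPTION (RECOMPILE);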

Beyond that, when joining a bunch of tables, check the execution plan, see where most of the data is coming from, and use that information to index accordingly.
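
For example, if the plan shows an expensive scan on Orders driven by the date range, a covering index along these lines might help (the index name and INCLUDE list are hypothetical; take the actual columns from your plan):

CREATE NONCLUSTERED INDEX IX_Orders_OrderDate   -- hypothetical example
    ON [dbo].[Orders] (OrderDate)
    INCLUDE (ArticleId, CustomerId);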
