I have an application where the user can retrieve a list. The user is allowed to add certain filters. For example:
- Articles: 123, 456, 789
- CustomerGroups: 1, 2, 3, 4, 5
- Customers: null
- ArticleGroups: null
- …
When a filter is empty (or null), the query must ignore that filter.
What is the most performant way to build the query so it can handle many (10+) different filters (and joins)?
My current approach is the following, but it doesn’t scale very well:
CREATE PROCEDURE [dbo].[GetFilteredList]
    @start datetime,
    @stop datetime,
    @ArticleList varchar(max),       -- '123,456,789'
    @ArticleGroupList varchar(max),
    @CustomerList varchar(max),
    @CustomerGroupList varchar(max)  -- '1,2,3,4,5'
    -- More filters here...
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @Articles TABLE (value VARCHAR(10));
    INSERT INTO @Articles (value)
    SELECT * FROM [dko_db].[dbo].fnSplitString(@ArticleList, ',');

    DECLARE @ArticleGroups TABLE (value VARCHAR(10));
    INSERT INTO @ArticleGroups (value)
    SELECT * FROM [dko_db].[dbo].fnSplitString(@ArticleGroupList, ',');

    DECLARE @Customers TABLE (value VARCHAR(10));
    INSERT INTO @Customers (value)
    SELECT * FROM [dko_db].[dbo].fnSplitString(@CustomerList, ',');

    DECLARE @CustomerGroups TABLE (value VARCHAR(10));
    INSERT INTO @CustomerGroups (value)
    SELECT * FROM [dko_db].[dbo].fnSplitString(@CustomerGroupList, ',');

    SELECT * -- Some columns here
    FROM [dbo].[Orders] o
    LEFT OUTER JOIN [dbo].[Article] a ON o.ArticleId = a.Id
    LEFT OUTER JOIN [dbo].[ArticleGroup] ag ON a.GroupId = ag.Id
    LEFT OUTER JOIN [dbo].[Customer] c ON o.CustomerId = c.Id
    LEFT OUTER JOIN [dbo].[CustomerGroup] cg ON c.GroupId = cg.Id
    -- More joins here
    WHERE o.OrderDate BETWEEN @start AND @stop
      AND (ISNULL(@ArticleList, '') = '' OR a.ArticleCode IN (SELECT value FROM @Articles))
      AND (ISNULL(@ArticleGroupList, '') = '' OR ag.GroupCode IN (SELECT value FROM @ArticleGroups))
      AND (ISNULL(@CustomerList, '') = '' OR c.CustomerCode IN (SELECT value FROM @Customers))
      AND (ISNULL(@CustomerGroupList, '') = '' OR cg.GroupCode IN (SELECT value FROM @CustomerGroups))
    ORDER BY c.Name, o.OrderDate
END
Answer
There are a lot of “low-hanging fruit” performance improvements here.
First, lose ORDER BY c.Name, o.OrderDate; unless callers genuinely need the rows sorted, that’s just needless sorting.
Second, for your “list” variables (e.g. @ArticleList): if you don’t need VARCHAR(MAX), change the data types to VARCHAR(8000). VARCHAR(MAX) is much slower than VARCHAR(8000); I never use MAX data types unless I’m certain they’re required.
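Assuming the comma-separated lists always fit in 8000 bytes, the change is just the parameter declarations. A sketch (CREATE OR ALTER needs SQL Server 2016 SP1+; use plain ALTER on older versions):

CREATE OR ALTER PROCEDURE [dbo].[GetFilteredList]
    @start datetime,
    @stop datetime,
    @ArticleList varchar(8000),      -- was varchar(max)
    @ArticleGroupList varchar(8000),
    @CustomerList varchar(8000),
    @CustomerGroupList varchar(8000)
AS
BEGIN
    SET NOCOUNT ON;
    -- ...body unchanged...
END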
Third, you can skip dumping your split values into table variables; that’s just needless overhead. You can lose all those declarations and inserts, then change THIS:
... a.ArticleCode in (select value from @Articles))
TO:
... a.ArticleCode in (SELECT value FROM dbo.fnSplitString(@ArticleList, ',')))
Fourth, if fnSplitString is not an inline table-valued function (i.e. if you see BEGIN and END in its DDL), it will be slow. An inline splitter will be much faster; consider DelimitedSplit8K or DelimitedSplit8K_LEAD.
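If you’re on SQL Server 2016 or later, the built-in STRING_SPLIT is already an inline table-valued function and slots straight into the pattern above. A minimal sketch; its output column is named value:

DECLARE @ArticleList varchar(8000) = '123,456,789';

-- One row per list element, in a column named value
SELECT value
FROM STRING_SPLIT(@ArticleList, ',');

On older versions, DelimitedSplit8K plays the same role; just adjust the column name to whatever your splitter returns.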
Last, I would add OPTION (RECOMPILE), since a query like this is highly unlikely to benefit from plan caching. The recompile forces the optimizer to look at the actual parameter values at execution time, rather than reusing a plan built for a different combination of filters.
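As a sketch against the tables above (again assuming STRING_SPLIT is available; the parameter values here are purely illustrative), the hint goes at the very end of the statement:

-- With RECOMPILE the optimizer sees the actual parameter values, so it can
-- prune the OR branches for filters that were passed as NULL/empty.
DECLARE @start datetime = '2023-01-01', @stop datetime = '2023-12-31';
DECLARE @ArticleList varchar(8000) = NULL;  -- this filter is not supplied

SELECT o.*  -- Some columns here
FROM [dbo].[Orders] o
LEFT OUTER JOIN [dbo].[Article] a ON o.ArticleId = a.Id
WHERE o.OrderDate BETWEEN @start AND @stop
  AND (ISNULL(@ArticleList, '') = ''
       OR a.ArticleCode IN (SELECT value FROM STRING_SPLIT(@ArticleList, ',')))
OPTION (RECOMPILE);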
Beyond that, when joining this many tables, check the actual execution plan, see where most of the cost is concentrated, and use that information to index accordingly.
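For example, if the plan shows the date range driving a scan of Orders, a covering index along these lines may help. This is a hypothetical sketch; the name and columns are illustrative, so match them to what your plan actually asks for:

-- Supports the OrderDate range seek and covers the join keys used above
CREATE NONCLUSTERED INDEX IX_Orders_OrderDate
    ON [dbo].[Orders] (OrderDate)
    INCLUDE (ArticleId, CustomerId);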