How to optimize a simple query that search on one table for IDes that are not a part of a set.
I created the following query
Select userId from user where userId not in (5000, 5001, 5002, 5003, more....)
Be aware that the list includes bit more than 35000 rows. I get the following database error
Msg 8623, Level 16, State 1, Line 1
The query processor ran out of internal resources and could not produce a query plan. This is a rare event and only expected for extremely complex queries or queries that reference a very large number of tables or partitions.
Some have suggested optimizing the query by using left join, but I just searching in one table so what is the alternative?
Advertisement
Answer
This is a documented behavior:
Explicitly including an extremely large number of values (many thousands of values separated by commas) within the parentheses, in an
IN
clause can consume resources and return errors 8623 or 8632. To work around this problem, store the items in theIN
list in a table, and use aSELECT
subquery within anIN
clause.
35000 clearly qualifies as many thousands. So, as per the documentation, you should create a table (or a temp table) to store your values, and then left join
it as follows:
select u.userId from user u left join mytemptable t on t.userId = u.userId where t.userId is null
You could also use not exists
:
select u.userId from user u where not exists (select 1 from mytemptable t where t.userId = u.userId)
As a bonus, note that using one of the above techniques fixes a null-safety issue with your original query (as it is, if any of the values in the IN
list is NULL
, the NOT IN
condition will be considered fulfilled, regardless of the value of userId
).