Skip to content
Advertisement

WHERE IN performs much better with Table Type than with hardcoded values in SQL Server

I have 2 queries that are essentially the same (at least if I didn’t miss something).

DECLARE @siloIds SiloIdsTableType
INSERT INTO @siloIds VALUES 
(1),(2),(3)

-- Query 1
SELECT *
FROM [Transaction]
WHERE 
      SiloId IN (1,2,3)
  AND Time > '2000-02-01'

-- Query 2
SELECT *
FROM [Transaction]
WHERE 
       SiloId IN (select SiloId from @siloIds)
   AND Time > '2000-02-01'

I was thinking that one cannot beat constants declared in the query itself, but apparently the first query is few times slower than the second one. It seems that SQL server is not smart enough to provide a good plan for the hardcoded values, or am I missing something here?

It seems that one shall not use where in with a long list and TVP should be always favored

P.S. I use thousand values instead of 1,2,3 in my query

P.P.S. I have a non-clustered index on SiloId ASC, Time ASC, but it seems that the first query is not using it favoring clustered index scan for some reason.

P.P.P.S. Execution plan shares the cost 14% to 86% in favor of the second query

Execution plan:

enter image description here

Advertisement

Answer

When you use a table variable (or a TVP, which is the same thing), SQL Server uses a fixed estimate that it will only get 1 row out of it (more on this below), a cardinality of 1. This means it assumes that the SiloId join filter is very selective, it will prioritize it and do a nested loop join to get just those rows, filtering afterwards on Time.

Whereas when you use constants, the exact size is hard-coded. For whatever reason (probably bad statistics), it has assumed that Time is more selective, and therefore prioritizes that over the other filter.

Where the table variable plan falls down is when the there are a lot of rows in it, or in the main table, because then you will get lots of key lookups, which can be slow.

Ideally you want the compiler to know the size of the table variable up front. You can do this in a number of ways, as Brent Ozar explains:

  • trace flag 2453, this causes a recompile if the cardinality is very different (good idea if you can risk a TF)
  • OPTION (RECOMPILE) (this recompiles every time, which may be inefficient in itself)
  • a temporary table (not possible as a parameter)
User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement