Skip to content
Advertisement

SQL WHERE IN () Performance Optimization

I checked several questions for a duplicate but I couldn’t find one. I am dealing with three tables, the first “articles”, the second “tags”, and the third “article_tags” which contains two foreign keys “articleid” and “tagid”. The table “article_tags” relates articles sharing the same tag together.

My SQL query is as follows:

I feel that the above query is taking time to load especially with the “article_tags” table growing bigger. I am using SQL Express Edition and the tables are indexed and I would like to know if this can be done in a better way to enhance performance.

Attached is the Execution Plan:

enter image description here

Running the below query as suggested by @SuperPoney, returned the same results, and below is the execution plan:

enter image description here

Advertisement

Answer

You can get everything you need in a single query:

I have assumed that you were originally using TOP 4 for tags as an arbitrary limit for performance reasons, as there was no sort. So have ommitted this. I have also changed your predicate from:

to

The meaning is the same, however by calling the DATEADD/DATEDIFF functions on the run time constants SYSDATETIME() and UTCDATETIME() it means this calculation is only done once, rather than once for every a.publish_date meaning any index on publish_date is now usable.

The other change I have made is to use EXISTS rather than JOIN to link articles to tags. This will avoid duplicates, however it would be equally trivial to remove duplicates using GROUP BY e.g.

A few side notes as well that don’t directly relate to the above answer, but are still worth mentioning.

  1. The Implicit join syntax you are using was replaced 28 years ago by ANSI 92 explicit join syntax. There are plenty of good reasons to switch to the “new” syntax, so I would advise you do.
  2. Parameterised queries are about more than just SQL Injection attacks (including but not limited to type safety and query plan caching), so just because your input isn’t coming from a user doesn’t mean you shouldn’t use parametrized queries.
  3. I would strongly advise against re-using your SqlClient objects (SqlConnection, SqlCommand), create a new object for each use, and dispose of it correctly when done.
User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement