Skip to content
Advertisement

DELETE and NOT IN query performance

I have following table structure:

Users

  • UserId (primary key)
  • UserName

SomeItems

  • SomeId(FK to Users.UserId)
  • SomeItemId (primary key)
  • Name
  • Other stuff…

SomeOtherItems

  • SomeId2 (FK to Users.UserId)
  • SomeOtherItemId (primary key)
  • Name
  • Other stuff…

I want to delete records from Users table which do not exist in SomeItems and SomeOtherItems tables.

I can do this:

DELETE from Users
FROM Users u
WHERE u.UserId NOT IN
   (SELECT DISTINCT SomeId FROM SomeItems

    UNION

    SELECT DISTINCT SomeId2 FROM SomeOtherItems)

However, it is very slow. I assume it executes the UNION query for every record, doesn’t it? Is there any way to improve the performance?

Advertisement

Answer

I would change the IN for two joins, the simpler the query, the easier is to optimize for the engine.

DELETE from U
FROM Users U
     left join SomeItems S1 on S1.SomeId = U.UserId
     left join SomeOtherItems S2 on S2.SomeID2 = U.UserId
WHERE S1.SomeId is null and S2.SomeID2 is null

Checking that S1.SomeID is null means that U.UserId was not present on SomeItems. Same for SomeOtherItems.

Be sure that you have indexes for SomeId and SomeID2 on SomeItems and SomeOtherItems.

User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement