I was trying to change an update statement where there is a subquery in the WHERE clause to a join in order to improve performance. DDL to sample table/data
CREATE TABLE [dbo].[Table1]( [pre_plan_id] [smallint] NULL, [pre_type_id] [smallint] NULL, [associate_id] [smallint] NOT NULL, [deleted] [bit] NOT NULL ) INSERT INTO Table1 VALUES (NULL, NULL, -32768, 0), (1, NULL, 2, 1), (1, NULL, 3, 0), (NULL, NULL, 3109, 0), (1, NULL, 3109, 1) CREATE TABLE [dbo].[Table2]( [type_id] [smallint] NOT NULL, [plan_id] [smallint] NOT NULL, [associate_id] [smallint] NOT NULL, [time_in] [smalldatetime] NOT NULL ) INSERT INTO Table2 VALUES (390, 31, 3109, '2009-09-02'), (304, 32, 3109, '2010-02-05'), (388, 31, 3109, '2010-09-24')
The query that uses a subquery:
SELECT pre_plan_id, pre_type_id FROM Table1 WHERE pre_plan_id =1 AND associate_id NOT IN (SELECT TOP 2 associate_id FROM Table2 WHERE time_in= '2010-09-24 00:00:00' group by associate_id order by count(*) desc)
My attempt of converting it to JOIN
SELECT pre_plan_id FROM (SELECT pre_plan_id, pre_type_id, rn FROM Table1 a LEFT JOIN ( select associate_id, Row_number() over (partition by associate_id order by count(*) desc ) rn FROM Table2 WHERE time_in= '2010-09-24 00:00:00' Group by associate_id) b ON a.associate_id = b.associate_id where b.rn <> 1) a where pre_plan_id = 1
However, this displays nothing while I expect two rows; and it is happening because of b.rn <> 1 I expected it would display the NULL values when it is b.rn <> 1
Any explanation for this? Any guide on better approach of tuning the query is much appreciated.
Thanks.
Advertisement
Answer
You can just move the subquery to the FROM
clause and use LEFT JOIN
:
SELECT t1.pre_plan_id, t1.pre_type_id FROM Table1 t1 LEFT JOIN (SELECT TOP 2 t2.associate_id FROM Table2 WHERE t2.time_in= '2010-09-24 00:00:00' GROUP BY associate_id ORDER BY COUNT(*) DESC ) t2 ON t2.associate_id = t1.associate_id WHERE t1.pre_plan_id = 1 AND t2.associate_id IS NULL;