Skip to content
Advertisement

Joining Temp Tables Using Indexes

I’m just beginning to learn about INDEX and I’ve heard it could speed up results but my testing is having the opposite effect.

Some points about the data:

  • My company uses a SQL Server but I don’t know much else beyond that
  • I’m just an employee and don’t have any sort of admin access and am even restricted to seeing into some folders
  • #TABLE1 contains over 23,000 lines and runs in 1 second
  • #TABLE2 contains over 3.1M lines and runs in about 1.5 minutes
  • Using a join takes about 1.75 to 2.5 minutes
  • My attempt as using INDEX takes 2.5 to 3 or more
  • There are more tables I’ll be joining afterwards but these 2 are essentially the foundation for everything else

The query I tried below takes longer than if I did a normal JOIN

IF OBJECT_ID('tempdb..#TABLE1') IS NOT NULL DROP TABLE #TABLE1;
IF OBJECT_ID('tempdb..#TABLE2') IS NOT NULL DROP TABLE #TABLE2;

SELECT
cast(T1.[ID] as varchar(20)) as 'ID',
cast(T1.[Division] as int) as 'Division',
cast(T1.[Category] as int) as 'Category',
cast(T1.[Platform] as int) as 'Platform',
cast(T1.[Condition] as tinyint) as 'Condition',
cast(T1.[First Received] as date) as 'First Received',
cast(T1.[Last Received] as date) as 'Last Received'

INTO #TABLE1

FROM
CompanyTable as T1

WHERE
T1.[Name] = 'Canada'
AND T1.[Division] = '100';

CREATE NONCLUSTERED INDEX IX_TABLE1 ON #TABLE1([ID], [Division], [Category], [Platform]);

SELECT DISTINCT
SE.[Date] as 'Date',
SE.[ID] as 'ID',
SE.[Division] as 'Division',
SE.[Category] as 'Category',
SE.[Platform] as 'Platform',
sum(SE.[Units]) as 'Units',
sum(SE.[Sales]) as 'Sales',
sum(SE.[Retail]) as 'Retail',
sum(SE.[Cost]) as 'Cost'

INTO #TABLE2

FROM 
(SELECT
    cast(S1.[Date] as date) as 'Date',
    cast(S1.[ID] as varchar(20)) as 'ID',
    cast(S1.[Division] as int) as 'Division',
    cast(S1.[Category] as int) as 'Category',
    cast(S1.[Platform] as int) as 'Platform',
    cast(sum(S1.[Quantity]) * -1 as decimal(38,20)) as 'Units',
    cast(sum(S1.[Net Amount]) * -1 as decimal(38,20)) as 'Sales',
    cast(sum(S1.[Cost Amount]) * -1 as decimal(38,20)) as 'Cost',
    cast(sum(S1.[Price]) as decimal(38,20)) as 'Retail'

    FROM
    SalesTable1 as S1

    WHERE
    S1.[Division] = '100'
    
    GROUP BY
    S1.[ID], S1.[Date], S1.[Division], S1.[Category], S1.[Platform]
    
    UNION ALL
    
    SELECT
    cast(S2.[Date] as date) as 'Date',
    cast(S2.[ID] as varchar(20)) as 'ID',
    cast(S2.[Division] as int) as 'Division',
    cast(S2.[Category] as int) as 'Category',
    cast(S2.[Platform] as int) as 'Platform',
    cast(sum(S2.[Quantity]) * -1 as decimal(38,20)) as 'Units',
    cast(sum(S2.[Net Amount]) * -1 as decimal(38,20)) as 'Sales',
    cast(sum(S2.[Cost Amount]) * -1 as decimal(38,20)) as 'Cost',
    cast(sum(S2.[Price]) as decimal(38,20)) as 'Retail'
    
    FROM
    SalesTable2 as S2
    
    WHERE
    S2.[Division] = '100'
    
    GROUP BY
    S2.[ID], S2.[Date], S2.[Division], S2.[Category], S2.[Platform]
) as T2

GROUP BY
T2.[ID], T2.[Date], T2.[Division], T2.[Category], SE.[Platform]

CREATE NONCLUSTERED INDEX IX_TABLE2 ON #TABLE2([ID], [Division], [Category], [Platform]);

SELECT *
FROM #TABLE1 as T1
JOIN #TABLE2 as T2 ON T1.[ID] = T2.[ID] AND T1.[Division] = T2.[Division] AND T1.[Category] = T2.[Category] AND T1.[Platform] = T2.[Platform]

DROP INDEX IX_TABLE1 ON #TABLE1;
DROP INDEX IX_TABLE2 ON #TABLE2;

Am I doing something wrong that’s causing it to go slower?

Advertisement

Answer

You need a primary key on your temp tables with unique values to use as a CLUSTERED INDEX. Then you only need one other NONCLUSTERED INDEX on the JOIN columns. Assuming ID in #TABLE1 is unique, create your indexes like so:

ALTER TABLE #TABLE1 ADD CONSTRAINT PK_TABLE1 PRIMARY KEY CLUSTERED ([ID]);
CREATE NONCLUSTERED INDEX IX_TABLE1 ON #TABLE1([Division], [Category], [Platform]);

Ditto for #TABLE2, again assuming ID id unique:

ALTER TABLE #TABLE2 ADD CONSTRAINT PK_TABLE2 PRIMARY KEY CLUSTERED ([ID]);
CREATE NONCLUSTERED INDEX IX_TABLE2 ON #TABLE2([Division], [Category], [Platform]);

However, this may not solve your performance issue, if the actual bottleneck is the non-temp tables. You should run the SELECT portions of your script separately. If they are performing well on their own, then try these indexes. If not, then you need to speak to your DBA and find out why the SELECT is running slow.

User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement