I’m just beginning to learn about INDEX and I’ve heard it could speed up results but my testing is having the opposite effect.
Some points about the data:
- My company uses a SQL Server but I don’t know much else beyond that
- I’m just an employee and don’t have any sort of admin access and am even restricted to seeing into some folders
- #TABLE1 contains over 23,000 lines and runs in 1 second
- #TABLE2 contains over 3.1M lines and runs in about 1.5 minutes
- Using a join takes about 1.75 to 2.5 minutes
- My attempt as using INDEX takes 2.5 to 3 or more
- There are more tables I’ll be joining afterwards but these 2 are essentially the foundation for everything else
The query I tried below takes longer than if I did a normal JOIN
IF OBJECT_ID('tempdb..#TABLE1') IS NOT NULL DROP TABLE #TABLE1;
IF OBJECT_ID('tempdb..#TABLE2') IS NOT NULL DROP TABLE #TABLE2;
SELECT
cast(T1.[ID] as varchar(20)) as 'ID',
cast(T1.[Division] as int) as 'Division',
cast(T1.[Category] as int) as 'Category',
cast(T1.[Platform] as int) as 'Platform',
cast(T1.[Condition] as tinyint) as 'Condition',
cast(T1.[First Received] as date) as 'First Received',
cast(T1.[Last Received] as date) as 'Last Received'
INTO #TABLE1
FROM
CompanyTable as T1
WHERE
T1.[Name] = 'Canada'
AND T1.[Division] = '100';
CREATE NONCLUSTERED INDEX IX_TABLE1 ON #TABLE1([ID], [Division], [Category], [Platform]);
SELECT DISTINCT
SE.[Date] as 'Date',
SE.[ID] as 'ID',
SE.[Division] as 'Division',
SE.[Category] as 'Category',
SE.[Platform] as 'Platform',
sum(SE.[Units]) as 'Units',
sum(SE.[Sales]) as 'Sales',
sum(SE.[Retail]) as 'Retail',
sum(SE.[Cost]) as 'Cost'
INTO #TABLE2
FROM
(SELECT
cast(S1.[Date] as date) as 'Date',
cast(S1.[ID] as varchar(20)) as 'ID',
cast(S1.[Division] as int) as 'Division',
cast(S1.[Category] as int) as 'Category',
cast(S1.[Platform] as int) as 'Platform',
cast(sum(S1.[Quantity]) * -1 as decimal(38,20)) as 'Units',
cast(sum(S1.[Net Amount]) * -1 as decimal(38,20)) as 'Sales',
cast(sum(S1.[Cost Amount]) * -1 as decimal(38,20)) as 'Cost',
cast(sum(S1.[Price]) as decimal(38,20)) as 'Retail'
FROM
SalesTable1 as S1
WHERE
S1.[Division] = '100'
GROUP BY
S1.[ID], S1.[Date], S1.[Division], S1.[Category], S1.[Platform]
UNION ALL
SELECT
cast(S2.[Date] as date) as 'Date',
cast(S2.[ID] as varchar(20)) as 'ID',
cast(S2.[Division] as int) as 'Division',
cast(S2.[Category] as int) as 'Category',
cast(S2.[Platform] as int) as 'Platform',
cast(sum(S2.[Quantity]) * -1 as decimal(38,20)) as 'Units',
cast(sum(S2.[Net Amount]) * -1 as decimal(38,20)) as 'Sales',
cast(sum(S2.[Cost Amount]) * -1 as decimal(38,20)) as 'Cost',
cast(sum(S2.[Price]) as decimal(38,20)) as 'Retail'
FROM
SalesTable2 as S2
WHERE
S2.[Division] = '100'
GROUP BY
S2.[ID], S2.[Date], S2.[Division], S2.[Category], S2.[Platform]
) as T2
GROUP BY
T2.[ID], T2.[Date], T2.[Division], T2.[Category], SE.[Platform]
CREATE NONCLUSTERED INDEX IX_TABLE2 ON #TABLE2([ID], [Division], [Category], [Platform]);
SELECT *
FROM #TABLE1 as T1
JOIN #TABLE2 as T2 ON T1.[ID] = T2.[ID] AND T1.[Division] = T2.[Division] AND T1.[Category] = T2.[Category] AND T1.[Platform] = T2.[Platform]
DROP INDEX IX_TABLE1 ON #TABLE1;
DROP INDEX IX_TABLE2 ON #TABLE2;
Am I doing something wrong that’s causing it to go slower?
Advertisement
Answer
You need a primary key on your temp tables with unique values to use as a CLUSTERED INDEX
. Then you only need one other NONCLUSTERED INDEX
on the JOIN
columns. Assuming ID
in #TABLE1
is unique, create your indexes like so:
ALTER TABLE #TABLE1 ADD CONSTRAINT PK_TABLE1 PRIMARY KEY CLUSTERED ([ID]);
CREATE NONCLUSTERED INDEX IX_TABLE1 ON #TABLE1([Division], [Category], [Platform]);
Ditto for #TABLE2
, again assuming ID
id unique:
ALTER TABLE #TABLE2 ADD CONSTRAINT PK_TABLE2 PRIMARY KEY CLUSTERED ([ID]);
CREATE NONCLUSTERED INDEX IX_TABLE2 ON #TABLE2([Division], [Category], [Platform]);
However, this may not solve your performance issue, if the actual bottleneck is the non-temp tables. You should run the SELECT
portions of your script separately. If they are performing well on their own, then try these indexes. If not, then you need to speak to your DBA and find out why the SELECT
is running slow.