I’m just beginning to learn about INDEX and I’ve heard it could speed up results but my testing is having the opposite effect.
Some points about the data:
- My company uses a SQL Server but I don’t know much else beyond that
- I’m just an employee and don’t have any sort of admin access and am even restricted to seeing into some folders
- #TABLE1 contains over 23,000 lines and runs in 1 second
- #TABLE2 contains over 3.1M lines and runs in about 1.5 minutes
- Using a join takes about 1.75 to 2.5 minutes
- My attempt as using INDEX takes 2.5 to 3 or more
- There are more tables I’ll be joining afterwards but these 2 are essentially the foundation for everything else
The query I tried below takes longer than if I did a normal JOIN
IF OBJECT_ID('tempdb..#TABLE1') IS NOT NULL DROP TABLE #TABLE1; IF OBJECT_ID('tempdb..#TABLE2') IS NOT NULL DROP TABLE #TABLE2; SELECT cast(T1.[ID] as varchar(20)) as 'ID', cast(T1.[Division] as int) as 'Division', cast(T1.[Category] as int) as 'Category', cast(T1.[Platform] as int) as 'Platform', cast(T1.[Condition] as tinyint) as 'Condition', cast(T1.[First Received] as date) as 'First Received', cast(T1.[Last Received] as date) as 'Last Received' INTO #TABLE1 FROM CompanyTable as T1 WHERE T1.[Name] = 'Canada' AND T1.[Division] = '100'; CREATE NONCLUSTERED INDEX IX_TABLE1 ON #TABLE1([ID], [Division], [Category], [Platform]); SELECT DISTINCT SE.[Date] as 'Date', SE.[ID] as 'ID', SE.[Division] as 'Division', SE.[Category] as 'Category', SE.[Platform] as 'Platform', sum(SE.[Units]) as 'Units', sum(SE.[Sales]) as 'Sales', sum(SE.[Retail]) as 'Retail', sum(SE.[Cost]) as 'Cost' INTO #TABLE2 FROM (SELECT cast(S1.[Date] as date) as 'Date', cast(S1.[ID] as varchar(20)) as 'ID', cast(S1.[Division] as int) as 'Division', cast(S1.[Category] as int) as 'Category', cast(S1.[Platform] as int) as 'Platform', cast(sum(S1.[Quantity]) * -1 as decimal(38,20)) as 'Units', cast(sum(S1.[Net Amount]) * -1 as decimal(38,20)) as 'Sales', cast(sum(S1.[Cost Amount]) * -1 as decimal(38,20)) as 'Cost', cast(sum(S1.[Price]) as decimal(38,20)) as 'Retail' FROM SalesTable1 as S1 WHERE S1.[Division] = '100' GROUP BY S1.[ID], S1.[Date], S1.[Division], S1.[Category], S1.[Platform] UNION ALL SELECT cast(S2.[Date] as date) as 'Date', cast(S2.[ID] as varchar(20)) as 'ID', cast(S2.[Division] as int) as 'Division', cast(S2.[Category] as int) as 'Category', cast(S2.[Platform] as int) as 'Platform', cast(sum(S2.[Quantity]) * -1 as decimal(38,20)) as 'Units', cast(sum(S2.[Net Amount]) * -1 as decimal(38,20)) as 'Sales', cast(sum(S2.[Cost Amount]) * -1 as decimal(38,20)) as 'Cost', cast(sum(S2.[Price]) as decimal(38,20)) as 'Retail' FROM SalesTable2 as S2 WHERE S2.[Division] = '100' GROUP BY S2.[ID], S2.[Date], S2.[Division], S2.[Category], S2.[Platform] ) as T2 GROUP BY T2.[ID], T2.[Date], T2.[Division], T2.[Category], SE.[Platform] CREATE NONCLUSTERED INDEX IX_TABLE2 ON #TABLE2([ID], [Division], [Category], [Platform]); SELECT * FROM #TABLE1 as T1 JOIN #TABLE2 as T2 ON T1.[ID] = T2.[ID] AND T1.[Division] = T2.[Division] AND T1.[Category] = T2.[Category] AND T1.[Platform] = T2.[Platform] DROP INDEX IX_TABLE1 ON #TABLE1; DROP INDEX IX_TABLE2 ON #TABLE2;
Am I doing something wrong that’s causing it to go slower?
Advertisement
Answer
You need a primary key on your temp tables with unique values to use as a CLUSTERED INDEX
. Then you only need one other NONCLUSTERED INDEX
on the JOIN
columns. Assuming ID
in #TABLE1
is unique, create your indexes like so:
ALTER TABLE #TABLE1 ADD CONSTRAINT PK_TABLE1 PRIMARY KEY CLUSTERED ([ID]); CREATE NONCLUSTERED INDEX IX_TABLE1 ON #TABLE1([Division], [Category], [Platform]);
Ditto for #TABLE2
, again assuming ID
id unique:
ALTER TABLE #TABLE2 ADD CONSTRAINT PK_TABLE2 PRIMARY KEY CLUSTERED ([ID]); CREATE NONCLUSTERED INDEX IX_TABLE2 ON #TABLE2([Division], [Category], [Platform]);
However, this may not solve your performance issue, if the actual bottleneck is the non-temp tables. You should run the SELECT
portions of your script separately. If they are performing well on their own, then try these indexes. If not, then you need to speak to your DBA and find out why the SELECT
is running slow.