I have a performance problem with a stored procedure. Because when I check my benchmark’s result I realized that “MatchxxxReferencesByIds” has ‘240.25’ ms Average LastElapsedTimeInSecond. How can I improve my procedure?
ALTER PROCEDURE [Common].[MatchxxxReferencesByIds] (@refxxxIds VARCHAR(MAX), @refxxxType NVARCHAR(250)) BEGIN SET NOCOUNT ON; BEGIN TRAN DECLARE @fake_tbl TABLE (xxxid NVARCHAR(50)) INSERT INTO @fake_tbl SELECT LTRIM(RTRIM(split.a.value('.', 'NVARCHAR(MAX)'))) AS fqdn FROM (SELECT CAST ('<M>' + REPLACE(@refxxxIds, ',', '</M><M>') + '</M>' AS XML) AS data ) AS a CROSS APPLY data.nodes ('/M') AS split(a) SELECT [p].[ReferencedxxxId] FROM [Common].[xxxReference] AS [p] WHERE ([p].[IsDeleted] = 0) AND (([p].[ReferencedxxxType] COLLATE Turkish_CI_AS = @refxxxType COLLATE Turkish_CI_AS ) AND [p].[ReferencedxxxId] COLLATE Turkish_CI_AS IN (SELECT ft.xxxid COLLATE Turkish_CI_AS FROM @fake_tbl ft)) COMMIT; END;
Advertisement
Answer
Based on the execution plan of the stored procedure, what makes it to perform slowly, is the part where you are going to work with XML.
Lets rethink about the solution:
I have created a table like this:
CREATE TABLE [Common].[EntityReference] ( IsDeleted BIT, ReferencedEntityType VARCHAR(100), ReferencedEntityId VARCHAR(10) ); GO
and manipulate it like this(Insert 1M records into it):
DECLARE @i INT = 1000000; DECLARE @isDeleted BIT, @ReferencedEntityType VARCHAR(100), @ReferencedEntityId VARCHAR(10); WHILE @i > 0 BEGIN SET @isDeleted =(SELECT @i % 2); SET @ReferencedEntityType = 'TEST' + CASE WHEN @i % 2 = 0 THEN '' ELSE CAST(@i % 2 AS VARCHAR(100)) END; SET @ReferencedEntityId = CAST(@i AS VARCHAR(10)); INSERT INTO [Common].[EntityReference] ( IsDeleted, ReferencedEntityType, ReferencedEntityId ) VALUES (@isDeleted, @ReferencedEntityType, @ReferencedEntityId); SET @i = @i - 1; END;
lets analyse your code:
You have a comma delimited input(@refEntityIds
), which you want to split it and then run a query against these values. (your SP’s subtree cost in my PC is about 376) To do so you have different approaches:
1.Pass a table variable to stored procedure which contains the refEntityIds
2.Make use of STRING_SPLIT function to split the string Lets see the sample query:
INSERT INTO @fake_tbl SELECT value FROM STRING_SPLIT(@refEntityIds, ',');
Using this, you will gain a great performance improvement in your code.(subtree cost: 6.19 without following indexes) BUT this feature is not available in SQL Server 2008!
You can use a replacement for this function(read this: https://stackoverflow.com/a/54926996/1666800) and changing your query to this(subtree cost still is about 6.19):
INSERT INTO @fake_tbl SELECT value FROM dbo.[fn_split_string_to_column](@refEntityIds,',')
In this case again you will see the notable performance improvement.
You can also create a non clustered index on [Common].[EntityReference]
table, which has a little performance improvement too. But please think about creating index, before creating it, it might have negative impact on your DML operations:
CREATE NONCLUSTERED INDEX [Index Name] ON [Common].[EntityReference] ( [IsDeleted] ASC ) INCLUDE ([ReferencedEntityType],[ReferencedEntityId])
In case that I have not this index(Suppose that I have replaced your split solution with mine), the subtree cost is: 6.19, When I add aforementioned index, the subtree cost decreased to 4.70, and finally when I change the index to following one, the subtree cost is 5.16
CREATE NONCLUSTERED INDEX [Index Name] ON [Common].[EntityReference] ( [ReferencedEntityType] ASC, [ReferencedEntityId] ASC ) INCLUDE ([IsDeleted])
Thanks to @PanagiotisKanavos following index will even performs better than the aforementioned ones (subtree cost: 3.95):
CREATE NONCLUSTERED INDEX IX_EntityReference_ReferenceEntityID ON Common.EntityReference (ReferencedEntityId) INCLUDE(ReferencedEntityType) WHERE IsDeleted =0;
Also please note that, using transaction against a local table variable has almost no effect, and probably you can simply ignore it.