I have been struggling with this for a while. I have a database with three tables (each of which has millions of records) as follows (removed some columns for simplicity): There exists indexes As follows To eliminate the possibility that the slowness is because of the selected columns, I only select a fixed value The following query runs very fast

SQL Server Query filter with order is slow

I have been struggling with this for a while.

I have a database with three tables (each of which has millions of records) as follows (removed some columns for simplicity):

1.[Entity]
    [Id]                UNIQUEIDENTIFIER PK,
    [EntityLevel_Id]    UNIQUEIDENTIFIER NOT NULL FK [EntityLevel] ([Id])
2.[EntityData]
    [Id]                UNIQUEIDENTIFIER PK,
    [Entity_Id]         UNIQUEIDENTIFIER NOT NULL FK [Entity] ([Id]),
    [DataLanguage_Id]   UNIQUEIDENTIFIER NOT NULL FK [Language] ([Id]),
    [Code]              NVARCHAR (250) NOT NULL
3.[EntityLevel]
    [Id]                UNIQUEIDENTIFIER PK,
    [Sort]              INT NOT NULL

​x
 
1.[Entity]    [Id]                UNIQUEIDENTIFIER PK,    [EntityLevel_Id]    UNIQUEIDENTIFIER NOT NULL FK [EntityLevel] ([Id])2.[EntityData]    [Id]                UNIQUEIDENTIFIER PK,    [Entity_Id]         UNIQUEIDENTIFIER NOT NULL FK [Entity] ([Id]),    [DataLanguage_Id]   UNIQUEIDENTIFIER NOT NULL FK [Language] ([Id]),    [Code]              NVARCHAR (250) NOT NULL3.[EntityLevel]    [Id]                UNIQUEIDENTIFIER PK,    [Sort]              INT NOT NULL​

There exists indexes As follows

[IX_Entity_EntityLevelId] ON [Entity] ([EntityLevel_Id])
[IX_EntityData_EntityId] ON [EntityData] ([Entity_Id])
[IX_EntityData_DataLanguageId_Code] ON [EntityData] ([DataLanguage_Id], [Code])
[IX_EntityLevel_Sort] ON [EntityLevel] ([Sort])

 
[IX_Entity_EntityLevelId] ON [Entity] ([EntityLevel_Id])[IX_EntityData_EntityId] ON [EntityData] ([Entity_Id])[IX_EntityData_DataLanguageId_Code] ON [EntityData] ([DataLanguage_Id], [Code])[IX_EntityLevel_Sort] ON [EntityLevel] ([Sort])​

To eliminate the possibility that the slowness is because of the selected columns, I only select a fixed value

The following query runs very fast (less than 1 second):

SELECT TOP 20
    1
FROM
    [Entity]
    INNER JOIN [EntityData] ON [Entity].[Id] = [EntityData].[Entity_Id]
    INNER JOIN [EntityLevel] ON [Entity].[EntityLevel_Id] = [EntityLevel].[Id]
WHERE
    [EntityData].[DataLanguage_Id] = 'B6930015-F177-4ED3-97B0-AAEF401F9265'

 
SELECT TOP 20    1FROM    [Entity]    INNER JOIN [EntityData] ON [Entity].[Id] = [EntityData].[Entity_Id]    INNER JOIN [EntityLevel] ON [Entity].[EntityLevel_Id] = [EntityLevel].[Id]WHERE    [EntityData].[DataLanguage_Id] = 'B6930015-F177-4ED3-97B0-AAEF401F9265'​

The following query also runs very fast:

SELECT TOP 20
    1
FROM
    [Entity]
    INNER JOIN [EntityData] ON [Entity].[Id] = [EntityData].[Entity_Id]
    INNER JOIN [EntityLevel] ON [Entity].[EntityLevel_Id] = [EntityLevel].[Id]
WHERE
    [EntityData].[DataLanguage_Id] = 'B6930015-F177-4ED3-97B0-AAEF401F9265'
ORDER BY
    [EntityData].[Code] ASC

 
SELECT TOP 20    1FROM    [Entity]    INNER JOIN [EntityData] ON [Entity].[Id] = [EntityData].[Entity_Id]    INNER JOIN [EntityLevel] ON [Entity].[EntityLevel_Id] = [EntityLevel].[Id]WHERE    [EntityData].[DataLanguage_Id] = 'B6930015-F177-4ED3-97B0-AAEF401F9265'ORDER BY    [EntityData].[Code] ASC​

And the following as well runs fast:

SELECT TOP 20
    1
FROM
    [Entity]
    INNER JOIN [EntityData] ON [Entity].[Id] = [EntityData].[Entity_Id]
    INNER JOIN [EntityLevel] ON [Entity].[EntityLevel_Id] = [EntityLevel].[Id]
WHERE
    [EntityLevel].[Sort] = 1

 
SELECT TOP 20    1FROM    [Entity]    INNER JOIN [EntityData] ON [Entity].[Id] = [EntityData].[Entity_Id]    INNER JOIN [EntityLevel] ON [Entity].[EntityLevel_Id] = [EntityLevel].[Id]WHERE    [EntityLevel].[Sort] = 1​

BUT, the following query runs VERY SLOW (roughly 10 seconds):

SELECT TOP 20
    1
FROM
    [Entity]
    INNER JOIN [EntityData] ON [Entity].[Id] = [EntityData].[Entity_Id]
    INNER JOIN [EntityLevel] ON [Entity].[EntityLevel_Id] = [EntityLevel].[Id]
WHERE
    [EntityData].[DataLanguage_Id] = 'B6930015-F177-4ED3-97B0-AAEF401F9265'
    AND
    [EntityLevel].[Sort] = 1
ORDER BY
    [EntityData].[Code]

 
SELECT TOP 20    1FROM    [Entity]    INNER JOIN [EntityData] ON [Entity].[Id] = [EntityData].[Entity_Id]    INNER JOIN [EntityLevel] ON [Entity].[EntityLevel_Id] = [EntityLevel].[Id]WHERE    [EntityData].[DataLanguage_Id] = 'B6930015-F177-4ED3-97B0-AAEF401F9265'    AND    [EntityLevel].[Sort] = 1ORDER BY    [EntityData].[Code]​

I don’t know the reason, and I don’t find any way to apply any more indexes so the query runs faster

Any help is appreciated!

Edit: the following queries also runs fast:

SELECT TOP 20
    1
FROM
    [Entity]
    INNER JOIN [EntityData] ON [Entity].[Id] = [EntityData].[Entity_Id]
    INNER JOIN [EntityLevel] ON [Entity].[EntityLevel_Id] = [EntityLevel].[Id]
WHERE
    [EntityData].[DataLanguage_Id] = 'B6930015-F177-4ED3-97B0-AAEF401F9265'
    AND
    [EntityLevel].[Sort] = 1

 
SELECT TOP 20    1FROM    [Entity]    INNER JOIN [EntityData] ON [Entity].[Id] = [EntityData].[Entity_Id]    INNER JOIN [EntityLevel] ON [Entity].[EntityLevel_Id] = [EntityLevel].[Id]WHERE    [EntityData].[DataLanguage_Id] = 'B6930015-F177-4ED3-97B0-AAEF401F9265'    AND    [EntityLevel].[Sort] = 1​

And

SELECT TOP 20
    1
FROM
    [Entity]
    INNER JOIN [EntityData] ON [Entity].[Id] = [EntityData].[Entity_Id]
    INNER JOIN [EntityLevel] ON [Entity].[EntityLevel_Id] = [EntityLevel].[Id]
WHERE
    [EntityLevel].[Sort] = 1
ORDER BY
    [EntityData].[Code]

 
SELECT TOP 20    1FROM    [Entity]    INNER JOIN [EntityData] ON [Entity].[Id] = [EntityData].[Entity_Id]    INNER JOIN [EntityLevel] ON [Entity].[EntityLevel_Id] = [EntityLevel].[Id]WHERE    [EntityLevel].[Sort] = 1ORDER BY    [EntityData].[Code]​

The problem appears only with order by and the two filters

Answer

For anyone who is concerned, this is the solution to the problem:

It turned out that with that much rows, what was taking time is to make key lookups and hash matches due to different indexes, so the solution was to add the following index to allow using of a single index on the table for all filter (and order by) values:

CREATE NONCLUSTERED INDEX [IX_EntityData_EntityId_DataLanguageId_IncCode] ON [EntityData] ([Entity_Id], [DataLanguage_Id]) INCLUDE ([Code])

 
CREATE NONCLUSTERED INDEX [IX_EntityData_EntityId_DataLanguageId_IncCode] ON [EntityData] ([Entity_Id], [DataLanguage_Id]) INCLUDE ([Code])​

But since I may also filter by code in some situations and not only use order by, so I modified the index to be like this:

CREATE NONCLUSTERED INDEX [IX_EntityData_EntityId_DataLanguageId_Code] ON [EntityData] ([Entity_Id], [DataLanguage_Id], [Code])

 
CREATE NONCLUSTERED INDEX [IX_EntityData_EntityId_DataLanguageId_Code] ON [EntityData] ([Entity_Id], [DataLanguage_Id], [Code])​

Advertisement

Answer