Skip to content
Advertisement

Filter and Match Column on a List of Substrings

I am attempting to filter a table of user actions on a list of specific actions. For example, given the following data

+--------+--------------------------------------------------------------+
| UserId | UserAction                                                   |
+--------+--------------------------------------------------------------+
| 12345  | Account Creation Successful                                  |
+--------+--------------------------------------------------------------+
| 23456  | Attempt to create duplicate account with email abc@email.com |
+--------+--------------------------------------------------------------+
| 34567  | Account Creation Unuccessful                                 |
+--------+--------------------------------------------------------------+
| 45678  | Attempt to create duplicate account with email xyz@email.com |
+--------+--------------------------------------------------------------+
| 56789  | Email Sent                                                   |
+--------+--------------------------------------------------------------+

I want all rows where the UserAction contains the strings “Account Creation” or “Attempt to create duplicate account”.

I was able to achieve this with the following Linq, but the query’s performance caused the webpage to timeout and blocked other queries to the table.

var actions = new HashSet<string>
{
    "Account Creation",
    "Attempt to create duplicate account"
};

using (var databaseContext = new DatabaseContext())
{
    var logs = databaseContext.Actions.Where(log => actions.Any(a => log.UserAction.Contains(a)));
}

To optimize this, I wanted to use an entirely SQL solution. One way of doing so could be using multiple

WHERE [UserAction] LIKE 'Account Creation%' OR [UserAction] LIKE 'Attempt to create duplicate account%' OR...

but I need to filter on many different actions, not just the two in this example, causing too long of a query with multiple OR LIKE statements.

Finally, I would like to use something similar to

WHERE [UserAction] IN ('Account Creation', 'Attempt to create duplicate account')

but the problem is the list contains substrings which will be found in the UserAction column. It would be great if I could use the entire UserAction string in the list used by the IN operator, but because some strings have variable data (like the email address), I need to match substrings.

So, what I am looking for is a way to achieve the hypothetical

WHERE ('Account Creation', 'Attempt to create duplicate account') IN [UserAction]

where a list of substrings can be matched on the values of a column in SQL.

Thanks for your time.

The following SQL can be used to reproduce the situation.

DROP TABLE IF EXISTS [MatchListOnList].[dbo].[Actions]

CREATE TABLE [MatchListOnList].[dbo].[Actions] (
    UserId varchar(16),
    UserAction varchar(128)
)

INSERT INTO [MatchListOnList].[dbo].[Actions] (
    UserId,
    UserAction
)
VALUES
(
    '12345',
    'Account Creation Successful'
),
(
    '23456',
    'Attempt to create duplicate account with email abc@email.com'
),
(
    '34567',
    'Account Creation Unsuccessful'
),
(
    '45678',
    'Attempt to create duplicate account with email xyz@email.com'
),
(
    '56789',
    'Email Sent'
),
(
    '67890',
    'Email Sent'
),
(
    '78901',
    'Email Sent'
)

Advertisement

Answer

I got it to work; it no longer times out and doesn’t block any other services.

I did this by building an SQL query and running it directly against the database.

var actions = new HashSet<string>
{
    "Account Creation",
    "Attempt to create duplicate account"
};

using (var databaseContext = new DatabaseContext())
{
    var query = "SELECT * FROM [dbo].[Actions] WHERE [UserAction] LIKE ";
    var queryStringActions = actions
        .Select(action => $"'{action}%'");

    query += $"{string.Join(" OR [UserAction] LIKE ", queryStringActions)}";

    var logs = databaseContext.Database.SqlQuery<Entities.Action>(query).AsQueryable();
}

The resulting SQL query is as follows

SELECT * FROM [dbo].[Actions]
WHERE [UserAction] LIKE 'Account Creation%'
OR [UserAction] LIKE 'Attempt to create duplicate account%'
User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement