
How to optimize datetime comparisons in MySQL in a WHERE clause

CONTEXT

I have a large table full of “documents” that are updated by outside sources. When I notice that the updates are more recent than my last touchpoint, I need to address these documents. I’m having some serious performance issues, though.

EXAMPLE CODE
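A full-table baseline count; the table name documents here and in the sketches below is an assumption:

    SELECT COUNT(*)
    FROM documents;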

gets me back 212,494,397 documents in 1 min 15.24 sec.
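The filtered count, with the same assumed table name (the COALESCE makes rows whose last_indexed_at is NULL, presumably never indexed, count as needing attention):

    SELECT COUNT(*)
    FROM documents
    WHERE COALESCE(updated_at > last_indexed_at, TRUE);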

which is approximately the actual query, gets me 55,988,860 documents in 14 min 36.23 sec.
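Notably, even dropping the COALESCE and comparing the two columns directly (same sketch assumptions):

    SELECT COUNT(*)
    FROM documents
    WHERE updated_at > last_indexed_at;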

takes about 15 minutes as well. (This was surprising to me.)

THE PROBLEM

How do I perform the updated_at > last_indexed_at comparison in a more reasonable time?

DETAILS

I’m pretty certain that my query is, in some way, not sargable. Unfortunately, I can’t find what about this query prevents MySQL from using an index instead of evaluating the predicate for every row.
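One way to see this, using the same assumed table name, is to ask EXPLAIN how MySQL plans to execute the query:

    EXPLAIN SELECT COUNT(*)
    FROM documents
    WHERE COALESCE(updated_at > last_indexed_at, TRUE);
    -- type: ALL with a row estimate near the full table size means a
    -- full table scan: the predicate is evaluated for every row.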

Rewriting the predicate in equivalent forms doesn’t do any better; every variant I tried ran in roughly the same time (two representative rewrites are sketched below).
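For illustration only (these are hypothetical equivalents, not necessarily the exact statements tried; they assume last_indexed_at is the NULLable column):

    -- Variant 1: make the NULL case explicit instead of using COALESCE
    SELECT COUNT(*)
    FROM documents
    WHERE last_indexed_at IS NULL
       OR updated_at > last_indexed_at;

    -- Variant 2: replace NULL with the epoch so the comparison never sees NULL
    SELECT COUNT(*)
    FROM documents
    WHERE updated_at > IFNULL(last_indexed_at, FROM_UNIXTIME(0));

Both still compare two columns of the same row, so neither can be satisfied by an index on updated_at or last_indexed_at alone.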

EDIT: FOLLOW-UP REQUESTED DATA

This question only involves one table (thankfully) in a Rails project, so we conveniently have the Rails definition for the table.
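For reference, the parts of that definition that matter here, sketched as MySQL DDL (names and types assumed):

    CREATE TABLE documents (
      id              BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
      updated_at      DATETIME NOT NULL,   -- touched by outside sources
      last_indexed_at DATETIME NULL        -- NULL until first indexed
    );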

SELECT VERSION(); got me 5.7.27-30-log

And, probably most important, the table definition as MySQL itself reports it (via SHOW CREATE TABLE documents;, for example) confirms exactly which indexes exist.


Answer

Oh! MySQL 5.7 introduced Generated Columns, which give us a way of indexing expressions! 🥳

If you do something like this:
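(A sketch of the idea; the needs_indexing column name and the index name below are invented for illustration.)

    ALTER TABLE documents
      ADD COLUMN needs_indexing BOOLEAN
        AS (COALESCE(updated_at > last_indexed_at, TRUE)) STORED;

    -- Index the generated column so the filter becomes a cheap lookup.
    -- (MySQL 5.7 can also index VIRTUAL generated columns, which avoids
    -- materializing the value in the row itself.)
    CREATE INDEX idx_documents_needs_indexing
      ON documents (needs_indexing);

Note that adding a STORED generated column rewrites the table, a one-time cost that is worth weighing on a table this size.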

…and change the query to:
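(Again with the hypothetical column name:)

    SELECT COUNT(*)
    FROM documents
    WHERE needs_indexing = TRUE;  -- equality on the indexed generated column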

…what results do you get?

Hopefully, we’re moving the work of evaluating COALESCE(updated_at > last_indexed_at, TRUE) from read time to write time.
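To verify that the index is actually being used, EXPLAIN on the rewritten query (same hypothetical names) should show the new index instead of a full scan:

    EXPLAIN SELECT COUNT(*)
    FROM documents
    WHERE needs_indexing = TRUE;
    -- expect the new index in the key column (e.g. type: ref,
    -- Extra: Using index) instead of the type: ALL full scan from before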

User contributions licensed under: CC BY-SA