
MySQL GROUP BY slows down query x1000 times

I’m struggling to set up a proper, effective index for my Django application, which uses a MySQL database. The problem is with the article table, which currently has a little over 1 million rows, and querying it isn’t as fast as we want.

The article table structure looks more or less like this:

After many attempts, I found that the index below gives the best performance:

And the problematic query is:

NOTE: I have to use the index explicitly, otherwise the query is much slower. This query takes about 1.4 s.

But when I remove only the GROUP BY clause, the query takes an acceptable 1-10 ms. I tried adding the newsarticle ID to the index at different positions, but without luck.

This is output from EXPLAIN (from Django):

Interestingly, the same query gives a different EXPLAIN in MySQL Workbench than in the Django debug toolbar (if you want, I can paste the EXPLAIN from Workbench as well), but the performance is more or less the same. Do you have an idea how to improve the index so the query runs quickly?

Thanks

EDIT: Here is the EXPLAIN from MySQL Workbench, which is different but seems more realistic (I'm not sure why the Django debug toolbar explains the query differently):

EDIT2: Below is the EXPLAIN when I remove GROUP BY from the query (using MySQL Workbench):

EDIT3:

After applying changes suggested by Rick (Thanks!):

An index on newsarticle(id, online, main_article_of_duplicate_group, date_published), plus two indexes on newsarticle_topics: (newstopic_id, newsarticle_id) and (newsarticle_id, newstopic_id).
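For reference, these indexes could be created roughly as follows. This is only a sketch: the index names are made up for illustration, and the table names are written as they appear in this post (the real Django tables may carry an app prefix).

```sql
-- Hypothetical index names; column lists are taken from the description above.
ALTER TABLE newsarticle
    ADD INDEX idx_article_suggested
        (id, online, main_article_of_duplicate_group, date_published);

-- Both directions of the article/topic pair, one index each.
ALTER TABLE newsarticle_topics
    ADD INDEX idx_topic_article (newstopic_id, newsarticle_id),
    ADD INDEX idx_article_topic (newsarticle_id, newstopic_id);
```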

WITH USE INDEX (takes 1.2 s)

EXPLAIN:

WITHOUT the USE INDEX clause (takes 2.6 s)

For comparison, the index newsarticle(date_published DESC, main_article_of_duplicate_group, source_id, online) with USE INDEX takes only 1-3 ms!


Answer

Is main_article_of_duplicate_group a true/false flag?

If the Optimizer chooses to start with newsarticle_topics:

If newsarticle_topics is a many-to-many mapping table, get rid of id and make the PRIMARY KEY be that pair, plus a secondary index in the opposite direction. More discussion: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table
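A rough sketch of that layout, assuming InnoDB: only the table and column names come from the question, while the column types and the index name are assumptions and should match the referenced id columns in the real schema.

```sql
-- Sketch of a many-to-many mapping table without a surrogate id.
-- Column types and the index name are assumed, not taken from the question.
CREATE TABLE newsarticle_topics (
    newsarticle_id INT UNSIGNED NOT NULL,
    newstopic_id   INT UNSIGNED NOT NULL,
    -- The pair itself is the primary key, so rows for one article sit together.
    PRIMARY KEY (newsarticle_id, newstopic_id),
    -- Secondary index in the opposite direction for topic-to-article lookups.
    INDEX idx_topic_article (newstopic_id, newsarticle_id)
) ENGINE=InnoDB;
```

With this shape, a lookup starting from either column can be answered from an index alone, since each index contains both columns.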

If the Optimizer chooses to start with newsarticle (which seems more likely):

Meanwhile, newsarticlefeedback needs this, in the order given:

Instead of

have

User contributions licensed under: CC BY-SA