Skip to content
Advertisement

Mysql GROUP BY too slow. Any help to make it faster?

So I have a JS script which people embed into their sites and it tracks all the URLs and clicks of visitors. Each visitor gets “Token” which is unique value to them and is used to track their actions on the site.

I wanted to show owners of the site actions of their visitors so I wrote following query. All the visits are stored in “custom_logs” table.

SELECT *
FROM custom_logs
WHERE pn = 'pn-9283896662' AND
      id IN (SELECT MAX(id)
             FROM custom_logs
             WHERE action_clicked_text LIKE '%sometext%'  
             GROUP BY token
           )
      AND token != '' AND
      action_timestamp > 11568 AND
      action_timestamp < 1570846368
 order by action_timestamp desc
 LIMIT 0, 30;

I narrowed the problem to “GROUP BY token” part of the query. When I remove this part query runs way faster, but still pretty slow (0.7s compared to 5s with Group By part), and there are 4 of those queries in one page. In custom_logs table there is already like 250 000 rows.

“GROUP BY token” part is there because I want to show people only one and only newest log of each user so they can click on it and see the complete logs of said user. So after the query I use mysqli_num_rows($results); to count the results. Is there any other way besides caching to quickly count all the results?

I read something about indexing columns so I made token varchar(255) and made it indexed but it did nothing in terms of speed. But also I am not so good in SQL at all.

Advertisement

Answer

I would suggest replacing in with a correlated subquery. I’m thinking:

SELECT cl.*
FROM custom_logs cl
WHERE cl.pn = 'pn-9283896662' AND
      cl.id = (select max(cl2.id)
               from customer_logs cl2
               where cl2.token = cl.token and
                     cl2.action_clicked_text LIKE '%sometext%'
              ) and
      cl.token <> '' AND
      cl.action_timestamp > 11568 AND
      cl.action_timestamp < 1570846368
order by cl.action_timestamp desc
limit 0, 30;

And for this, I recommend the following indexes:

  • customer_logs(pn, action_timestamp, token, id)
  • customer_logs(token, action_clicked_text, id)
User contributions licensed under: CC BY-SA
8 People found this is helpful
Advertisement