Skip to content
Advertisement

Optimize MySQL InnoDB query for max, count

I have an MySQL InnoDB table with 5.7M rows and 1.9GB size:

+-------------------+---------+------+-----+---------+----------------+
|       Field       |  Type   | Null | Key | Default |     Extra      |
+-------------------+---------+------+-----+---------+----------------+
| id                | int(20) | NO   | PRI | NULL    | auto_increment |
| listing_id        | int(20) | YES  |     | NULL    |                |
| listing_link      | text    | YES  |     | NULL    |                |
| transaction_title | text    | YES  |     | NULL    |                |
| image_thumb       | text    | YES  |     | NULL    |                |
| seller_link       | text    | YES  |     | NULL    |                |
| seller_name       | text    | YES  |     | NULL    |                |
| sale_date         | date    | YES  |     | NULL    |                |
+-------------------+---------+------+-----+---------+----------------+

Here are my my.ini settings for my 3GB RAM server:

key_buffer = 16M
max_allowed_packet = 16M
sort_buffer_size = 8M
net_buffer_length = 8K
read_buffer_size = 2M
read_rnd_buffer_size = 16M
myisam_sort_buffer_size = 8M
log_error = "mysql_error.log"
innodb_autoinc_lock_mode=0
join_buffer_size = 8M
thread_cache_size = 8
thread_concurrency = 8
query_cache_size = 64M
query_cache_limit = 2M
ft_min_word_len = 4
thread_stack = 192K
tmp_table_size = 64M

innodb_buffer_pool_size = 2G
innodb_additional_mem_pool_size = 16M
innodb_log_file_size = 512M
innodb_log_buffer_size = 8M
innodb_flush_log_at_trx_commit = 1
innodb_lock_wait_timeout = 120
innodb_write_io_threads = 8
innodb_read_io_threads = 8
innodb_thread_concurrency = 16
innodb_log_files_in_group = 3
innodb_max_dirty_pages_pct = 90

When I run next query it takes over 20 minutes to return the results:

SELECT transaction_title, 
       listing_id, 
       seller_name, 
       Max(sale_date) AS sale_date, 
       Count(*)       AS count 
FROM   sales_meta 
WHERE `sale_date` BETWEEN '2017-06-06' AND '2017-06-06' 
GROUP  BY listing_id 
HAVING Count(*) > 1 
ORDER  BY count DESC, 
          seller_name;

I’ve done some research and it appears I need to add some indexes to speed things up, but I am confused how to go about it. There are some single-column indexes and some multi-column indexes, which one should I do?

To make things more complicated, there are few other queries that I will need to perform on this table regularly:

SELECT * 
FROM   sales_meta 
WHERE ` sale_date `= '2017-06-06'; 

and

SELECT DISTINCT `seller_name` 
FROM   `sales_meta`; 

These two are probably less taxing, but I still need to optimize for them as well if possible, although the first query out of three is the top priority for now.

Advertisement

Answer

if you want just the value for a single day and the the data type is date then you could avoid the between clause and use =

    SELECT transaction_title, 
           listing_id, 
           seller_name, 
           Max(sale_date) AS max_sale_date, 
           Count(*)       AS count 
    FROM   sales_meta 
    WHERE sale_date =  str_to_date('2017-06-06', '%Y-%m-%d')  
    GROUP  BY listing_id 
    HAVING Count(*) > 1 
    ORDER  BY count DESC, seller_name;

and be sure you have an index on sale_date

Advertisement