I have an MySQL InnoDB table with 5.7M rows and 1.9GB size:
+-------------------+---------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +-------------------+---------+------+-----+---------+----------------+ | id | int(20) | NO | PRI | NULL | auto_increment | | listing_id | int(20) | YES | | NULL | | | listing_link | text | YES | | NULL | | | transaction_title | text | YES | | NULL | | | image_thumb | text | YES | | NULL | | | seller_link | text | YES | | NULL | | | seller_name | text | YES | | NULL | | | sale_date | date | YES | | NULL | | +-------------------+---------+------+-----+---------+----------------+
Here are my my.ini settings for my 3GB RAM server:
key_buffer = 16M max_allowed_packet = 16M sort_buffer_size = 8M net_buffer_length = 8K read_buffer_size = 2M read_rnd_buffer_size = 16M myisam_sort_buffer_size = 8M log_error = "mysql_error.log" innodb_autoinc_lock_mode=0 join_buffer_size = 8M thread_cache_size = 8 thread_concurrency = 8 query_cache_size = 64M query_cache_limit = 2M ft_min_word_len = 4 thread_stack = 192K tmp_table_size = 64M innodb_buffer_pool_size = 2G innodb_additional_mem_pool_size = 16M innodb_log_file_size = 512M innodb_log_buffer_size = 8M innodb_flush_log_at_trx_commit = 1 innodb_lock_wait_timeout = 120 innodb_write_io_threads = 8 innodb_read_io_threads = 8 innodb_thread_concurrency = 16 innodb_log_files_in_group = 3 innodb_max_dirty_pages_pct = 90
When I run next query it takes over 20 minutes to return the results:
SELECT transaction_title, listing_id, seller_name, Max(sale_date) AS sale_date, Count(*) AS count FROM sales_meta WHERE `sale_date` BETWEEN '2017-06-06' AND '2017-06-06' GROUP BY listing_id HAVING Count(*) > 1 ORDER BY count DESC, seller_name;
I’ve done some research and it appears I need to add some indexes to speed things up, but I am confused how to go about it. There are some single-column indexes and some multi-column indexes, which one should I do?
To make things more complicated, there are few other queries that I will need to perform on this table regularly:
SELECT * FROM sales_meta WHERE ` sale_date `= '2017-06-06';
and
SELECT DISTINCT `seller_name` FROM `sales_meta`;
These two are probably less taxing, but I still need to optimize for them as well if possible, although the first query out of three is the top priority for now.
Advertisement
Answer
if you want just the value for a single day and the the data type is date then you could avoid the between clause and use =
SELECT transaction_title, listing_id, seller_name, Max(sale_date) AS max_sale_date, Count(*) AS count FROM sales_meta WHERE sale_date = str_to_date('2017-06-06', '%Y-%m-%d') GROUP BY listing_id HAVING Count(*) > 1 ORDER BY count DESC, seller_name;
and be sure you have an index on sale_date