I have following query
SELECT SUBSTRING(a0_.created_date FROM 1 FOR 10) AS sclr_0, COUNT(1) AS sclr_1 FROM applications a0_ INNER JOIN package_codes p1_ ON a0_.id = p1_.application_id WHERE a0_.created_date BETWEEN '2019-01-01' AND '2020-01-01' AND p1_.type = 'Package 1' GROUP BY sclr_0
— EDIT —
Most of you have focused on GROUP BY and SUBSTRING, but this is not the root of problem.
Following query have the same execution time:
SELECT COUNT(1) AS sclr_1 FROM applications a0_ INNER JOIN package_codes p1_ ON a0_.id = p1_.application_id WHERE a0_.created_date BETWEEN '2019-01-01' AND '2020-01-01' AND p1_.type = 'Package 1'
— EDIT 2 —
After adding index on applications.created_date and forcing the query to use specified indexes as @DDS suggests the execution time drops to ~750ms
The current query looks like:
SELECT SUBSTRING(a0_.created_date FROM 1 FOR 10) AS sclr_0, COUNT(1) AS sclr_1 FROM applications a0_ USE INDEX (applications_created_date_idx) INNER JOIN package_codes p1_ USE INDEX (PRIMARY, UNIQ_70A9C6AA3E030ACD, package_codes_type_idx) ON a0_.id = p1_.application_id WHERE a0_.created_date BETWEEN '2019-01-01' AND '2020-01-01' AND p1_.type = 'Package 1' GROUP BY sclr_0
— EDIT 3 —
I have found that using to much indexes in query may cause that in some cases MySQL will use non optimal indexes, so the final query should looks as following:
SELECT SUBSTRING(a0_.created_date FROM 1 FOR 10) AS sclr_0, COUNT(1) AS sclr_1 FROM applications a0_ USE INDEX (applications_created_date_idx) INNER JOIN package_codes p1_ USE INDEX (package_codes_application_idx) ON a0_.id = p1_.application_id WHERE a0_.created_date BETWEEN '2019-01-01' AND '2020-01-01' AND p1_.type = 'Package 1' GROUP BY sclr_0
— END EDIT —
package_codes contains over 100.000.000 records.
applications contains over 250.000 records.
The query takes 2 minutes to get result. Is there any way to optimize it? I’m stuck on MySQL 5.5.
Tables:
CREATE TABLE `applications` ( `id` int(11) NOT NULL, `created_date` datetime NOT NULL, `name` varchar(64) COLLATE utf8mb4_unicode_ci NOT NULL, `surname` varchar(64) COLLATE utf8mb4_unicode_ci NOT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci; ALTER TABLE `applications` ADD PRIMARY KEY (`id`), ADD KEY `applications_created_date_idx` (`created_date`); ALTER TABLE `applications` MODIFY `id` int(11) NOT NULL AUTO_INCREMENT;
CREATE TABLE `package_codes` ( `id` int(11) NOT NULL, `application_id` int(11) DEFAULT NULL, `created_date` datetime NOT NULL, `type` varchar(50) COLLATE utf8mb4_unicode_ci NOT NULL, `code` varchar(50) COLLATE utf8mb4_unicode_ci NOT NULL, `disabled` tinyint(1) NOT NULL DEFAULT '0', `meta_data` longtext COLLATE utf8mb4_unicode_ci ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci; ALTER TABLE `package_codes` ADD PRIMARY KEY (`id`), ADD UNIQUE KEY `UNIQ_70A9C6AA3E030ACD` (`application_id`), ADD KEY `package_codes_code_idx` (`code`), ADD KEY `package_codes_type_idx` (`type`), ADD KEY `package_codes_application_idx` (`application_id`), ADD KEY `package_codes_code_application_idx` (`code`,`application_id`); ALTER TABLE `package_codes` MODIFY `id` int(11) NOT NULL AUTO_INCREMENT; ALTER TABLE `package_codes` ADD CONSTRAINT `FK_70A9C6AA3E030ACD` FOREIGN KEY (`application_id`) REFERENCES `applications` (`id`);
Advertisement
Answer
After adding index on applications.created_date and forcing the query to use specified indexes as @DDS suggests the execution time drops to ~750ms
The final query should look as following:
SELECT SUBSTRING(a0_.created_date FROM 1 FOR 10) AS sclr_0, COUNT(1) AS sclr_1 FROM applications a0_ USE INDEX (applications_created_date_idx) INNER JOIN package_codes p1_ USE INDEX (package_codes_application_idx) ON a0_.id = p1_.application_id WHERE a0_.created_date BETWEEN '2019-01-01' AND '2020-01-01' AND p1_.type = 'Package 1' GROUP BY sclr_0