Skip to content
Advertisement

Improving the performance of a MySQL left join sub query

I have the following MySQL query which calculates the total number of orders for each month within a given date range e.g. a year. The query works correctly, but the performance is slow (around 250ms).

Any ideas on how to rewrite it to make it more efficient?

WITH recursive `dates` AS (
    (
        SELECT '2019-11-28' AS item
    )
    UNION
    ALL (
        SELECT
            item + INTERVAL 1 DAY
        FROM
            `dates`
        WHERE
            item + INTERVAL 1 DAY <= '2020-11-27'
    )
)
SELECT
    DATE_FORMAT(`item`, '%b %y') AS `date`,
    COUNT(`orders`.`id`) AS `total`
FROM
    `dates`
    LEFT JOIN (
        SELECT
            `orders`.`id`,
            `orders`.`created_at`
        FROM
            `orders`
            INNER JOIN `locations` ON `orders`.`location_id` = `locations`.`id`
        WHERE
            `orders`.`shop_id` = 10379184
            AND `locations`.`country_id` = 128
            AND `orders`.`created_at` >= '2019-11-28 12:01:42'
            AND `orders`.`created_at` <= '2020-11-27 12:01:42'
    ) AS `orders` ON DATE(`orders`.`created_at`) = `dates`.`item`
GROUP BY
    `date`

UPDATE: Some have suggested using two left joins, however if I do that then the country_id filter is not applied:

WITH recursive `dates` AS (
    (
        SELECT
            '2019-11-28' AS item
    )
    UNION
    ALL (
        SELECT
            item + INTERVAL 1 DAY
        FROM
            `dates`
        WHERE
            item + INTERVAL 1 DAY <= '2020-11-27'
    )
)
SELECT
    DATE_FORMAT(`item`, '%b %y') AS `date`,
    COUNT(`orders`.`id`) AS `total`
FROM
    `dates`
    LEFT JOIN `orders` USE INDEX (`orders_created_at_index`) ON DATE(`created_at`) = `dates`.`item`
    AND `orders`.`shop_id` = 10379184
    AND `orders`.`created_at` >= '2019-11-28 12:22:43'
    AND `orders`.`created_at` <= '2020-11-27 12:22:43'
    LEFT JOIN `locations` ON `orders`.`location_id` = `locations`.`id`
    AND `locations`.`country_id` = 128
GROUP BY
    `date`

Thanks!

Advertisement

Answer

After much tinkering, I produced the following which operates in under 40ms, which is good enough for my needs. I still think it’s not ideal and would welcome any improvements…

SELECT
    `date`,
    COUNT(`order`)
FROM
    (
        WITH recursive `dates` AS (
            (
                SELECT
                    '2019-11-28' AS item
            )
            UNION
            ALL (
                SELECT
                    item + INTERVAL 1 DAY
                FROM
                    `dates`
                WHERE
                    item + INTERVAL 1 DAY <= '2020-11-27'
            )
        )
        SELECT
            DATE_FORMAT(`item`, '%b %y') AS `DATE`,
            `orders`.`id` AS `order`,
            `locations`.`id` AS `location`
        FROM
            `dates`
        LEFT JOIN 
            `orders` 
        ON 
            DATE(`created_at`) = `dates`.`item`
        AND 
            `orders`.`shop_id` = 10379184
        AND 
            `orders`.`created_at` >= '2019-11-28 12:22:43'
        AND 
            `orders`.`created_at` <= '2020-11-27 12:22:43'
        LEFT JOIN 
            `locations` 
        ON 
            `orders`.`location_id` = `locations`.`id`
        AND 
            `locations`.`country_id` = 209
    ) AS items
WHERE
    (
        `order` IS NULL
        AND `location` IS NULL
    )
    OR (
        `order` IS NOT NULL
        AND `location` IS NOT NULL
    )
GROUP BY
    `date`
User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement