Skip to content
Advertisement

Getting wrong sum when using group by with inner join

With reference to this question (How to get the sum in a joined table when using group by – getting wrong results) I have two tables orders and order_items. I need to group the results by days. But I also need to get the sum of energy_used for each day from another table. When I try that using a join, I get wrong order_sum for each day (they are not being summed up). Not sure what I am doing wrong.

I would like to get for each day

  • the sum of order_items.energy_used for all orders created that day
  • the sum of orders.order_sum for all orders created that day
  • the created_at and order_sum that correspond to the latest order created on that day

Here is my orders table

+----+-----------+---------+---------------------+
| id | order_sum | user_id | created_at          |
+----+-----------+---------+---------------------+
| 1  | 25.13     | 7       | 2020-01-25 09:13:00 |
| 2  | 10.00     | 7       | 2020-01-25 15:23:00 |
| 3  | 14.00     | 5       | 2020-01-26 10:14:00 |
| 4  | 35.00     | 1       | 2020-01-27 11:13:00 |
+----+-----------+---------+---------------------+

And here is my order_items table

+----+----------+-------------+---------------------+
| id | order_id | energy_used | created_at          |
+----+----------+-------------+---------------------+
| 1  | 1        | 65          | 2020-01-25 09:13:00 |
| 2  | 1        | 12          | 2020-01-25 09:13:00 |
| 3  | 2        | 70          | 2020-01-26 10:14:00 |
| 4  | 2        | 5           | 2020-01-26 10:14:00 |
| 5  | 3        | 0           | 2020-01-27 11:13:00 |
+----+----------+-------------+---------------------+

And this is the desired result that I am trying to achieve

+---------------+-----------------+-------------------+---------------------+----------------+
| date_of_month | total_order_sum | total_energy_used | last_order_date     | last_order_sum |
+---------------+-----------------+-------------------+---------------------+----------------+
| 2020-01-25    | 35.13           | 77                | 2020-01-25 09:13:00 | 25.13          |
| 2020-01-26    | 14.00           | 75                | 2020-01-26 10:14:00 | 14.00          |
| 2020-01-27    | 35.00           | 0                 | 2020-01-27 11:13:00 | 35.00          |
+---------------+-----------------+-------------------+---------------------+----------------+

And here is the query that I have tried but I’m getting wrong results, the order_sum is not being calculated correctly. It is showing the same as last_order_sum

select 
    date(o.created_at) date_of_month,
    i.total_energy_used,
    o.created_at last_order_date,
    o.order_sum last_order_sum,
    sum(order_sum) as total_order_sum
from orders o
inner join (
    select date(o1.created_at) date_of_month, sum(i1.energy_used) total_energy_used
    from orders o1
    inner join order_items i1 on o1.id = i1.order_id
    group by date(o1.created_at)
) i on i.date_of_month = date(o.created_at)
where o.created_at = (
    select max(o1.created_at)
    from orders o1
    where date(o1.created_at) = date(o.created_at)
)

Here is a fiddle: https://dbfiddle.uk/?rdbms=mysql_5.6&fiddle=92b8cc2920ad9f7a7cdd56bded5a3bf2

Advertisement

Answer

Always join tables together on their relationships (in this case orders.id with order_items.order_id) and then group. to avoid duplicating order_sums for multiple order_items when joining, first group order_items by order_id.

select 
    date(o.created_at) date_of_month,
    sum(i.total_energy_used),
    max(o.created_at),
    sum(order_sum) as total_order_sum
from orders o
inner join (
    select order_id, sum(total_energy_used) total_energy_used
    from order_items i
    group by order_id
) i on o.id = i.order_id
group by date(o.created_at)

from this point onwards you can do a join again on orders with max(o.created_at) to get the order_sum of the last order. moral of the story: keep an eye on your granularity.

User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement