I have a table with the data in the following format:
id  day        duration
1   Monday     15
1   Monday     17
1   Tuesday    20
1   Monday     8
2   Monday     9
3   Wednesday  11
...
It contains the number of seconds each user has watched TV on a specific day. One user can have more than one watching session per day. For example, user 1 has watched TV three times on Monday.
What I need to do is create a separate column for each day, containing the total number of seconds each user watched on that day. In other words, I need one row and seven columns for each user. The columns would be Monday, Tuesday, Wednesday, etc.
It would look like this:
id  Monday  Tuesday  Wednesday  ...
1   40      20       0
2   9       0        0
3   0       0        11
Right now, I can do that using the following script:
SELECT df.id, monday, tuesday, wednesday, thursday, friday, saturday, sunday
FROM (
    SELECT id, SUM(duration) AS monday
    FROM data
    WHERE day = 'Monday'
    GROUP BY id
) AS df
FULL JOIN (
    SELECT id, SUM(duration) AS tuesday
    FROM data
    WHERE day = 'Tuesday'
    GROUP BY id
) AS df1 ON df.id = df1.id
FULL JOIN (
    SELECT id, SUM(duration) AS wednesday
    FROM data
    WHERE day = 'Wednesday'
    GROUP BY id
) AS df2 ON df.id = df2.id
FULL JOIN (
    SELECT id, SUM(duration) AS thursday
    FROM data
    WHERE day = 'Thursday'
    GROUP BY id
) AS df3 ON df.id = df3.id
FULL JOIN (
    SELECT id, SUM(duration) AS friday
    FROM data
    WHERE day = 'Friday'
    GROUP BY id
) AS df4 ON df.id = df4.id
FULL JOIN (
    SELECT id, SUM(duration) AS saturday
    FROM data
    WHERE day = 'Saturday'
    GROUP BY id
) AS df5 ON df.id = df5.id
FULL JOIN (
    SELECT id, SUM(duration) AS sunday
    FROM data
    WHERE day = 'Sunday'
    GROUP BY id
) AS df6 ON df.id = df6.id
The script does its job, but I’m pretty sure it’s not the most efficient way to do what I want, and I’m looking for a way to make it better and shorter. I know there must be a much simpler solution, but I don’t even know how to phrase the question correctly.
Could anyone help me please or point to a similar question? I’d really appreciate it!
Answer
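Use conditional aggregation: group by id once and, for each day, sum duration only over the rows that match that day: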
select id,
       sum(case when day = 'Monday' then duration end) as monday,
       sum(case when day = 'Tuesday' then duration end) as tuesday,
       . . .
from data
group by id
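
To try this out, here is a minimal sketch that runs a conditional-aggregation query against the sample rows from the question. It makes a few assumptions that are not in the original post: it uses Python's built-in sqlite3 module with an in-memory database (the question does not name a database engine), the helper name pivot_by_day is hypothetical, and each CASE gets an ELSE 0 so that days with no sessions come back as 0, as in the desired output, rather than NULL.

```python
import sqlite3

# Sample rows copied from the question: (id, day, duration in seconds).
ROWS = [
    (1, "Monday", 15),
    (1, "Monday", 17),
    (1, "Tuesday", 20),
    (1, "Monday", 8),
    (2, "Monday", 9),
    (3, "Wednesday", 11),
]

# One pass over the table: each SUM only counts rows for its own day.
# ELSE 0 is an addition to the answer's query so that empty days yield 0, not NULL.
PIVOT_SQL = """
SELECT id,
       SUM(CASE WHEN day = 'Monday'    THEN duration ELSE 0 END) AS monday,
       SUM(CASE WHEN day = 'Tuesday'   THEN duration ELSE 0 END) AS tuesday,
       SUM(CASE WHEN day = 'Wednesday' THEN duration ELSE 0 END) AS wednesday,
       SUM(CASE WHEN day = 'Thursday'  THEN duration ELSE 0 END) AS thursday,
       SUM(CASE WHEN day = 'Friday'    THEN duration ELSE 0 END) AS friday,
       SUM(CASE WHEN day = 'Saturday'  THEN duration ELSE 0 END) AS saturday,
       SUM(CASE WHEN day = 'Sunday'    THEN duration ELSE 0 END) AS sunday
FROM data
GROUP BY id
ORDER BY id
"""


def pivot_by_day(rows):
    """Load rows into an in-memory table and return one tuple per user.

    Hypothetical helper for illustration; the table name `data` and its
    columns follow the question.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE data (id INTEGER, day TEXT, duration INTEGER)")
        conn.executemany("INSERT INTO data VALUES (?, ?, ?)", rows)
        return conn.execute(PIVOT_SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in pivot_by_day(ROWS):
        print(row)
```

Each tuple is (id, monday, tuesday, ..., sunday). Unlike the FULL JOIN version, this reads the table once and does not depend on a user having a row for Monday. Other databases may offer shorter syntax for the same idea, but the CASE form shown here is the portable one.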