Skip to content
Advertisement

SQL subquery in SELECT clause

I’m trying to find admin activity within the last 30 days.

The accounts table stores the user data (username, password, etc.)

At the end of each day, if a user had logged in, it will create a new entry in the player_history table with their updated data. This is so we can track progress over time.

accounts table:

id username admin
1 Michael 4
2 Steve 3
3 Louise 3
4 Joe 0
5 Amy 1

player_history table:

id user_id created_at playtime
0 1 2021-04-03 10
1 2 2021-04-04 10
2 3 2021-04-05 15
3 4 2021-04-10 20
4 5 2021-04-11 20
5 1 2021-05-12 40
6 2 2021-05-13 55
7 3 2021-05-17 65
8 4 2021-05-19 75
9 5 2021-05-23 30
10 1 2021-06-01 60
11 2 2021-06-02 65
12 3 2021-06-02 67
13 4 2021-06-03 90

The following query

SELECT a.`username`, SEC_TO_TIME((MAX(h.`playtime`) - MIN(h.`playtime`))*60) as 'time' FROM `player_history` h, `accounts` a WHERE h.`created_at` > '2021-05-06' AND h.`user_id` = a.`id` AND a.`admin` > 0 GROUP BY h.`user_id`

Outputs this table:

Note that this is just admin activity, so Joe is not included in this data.

from 2021-05-06 to present (yy-mm-dd):

username time
Michael 00:20:00
Steve 00:10:00
Louise 00:02:00
Amy 00:00:00

As you can see this from data, Amy’s time is shown as 0 although she has played for 10 minutes in the last month. This is because she only has 1 entry starting from 2021-05-06 so there is no data to compare to. It is 0 because 10-10 = 0.

Another flaw is that it doesn’t include all activity in the last month, basically only subtracts the highest value from the lowest.

So I tried fixing this by comparing the highest value after 2021-05-06 to their most previous login before the date. So I modified the query a bit:

SELECT a.`Username`, SEC_TO_TIME((MAX(h.`playtime`) - (SELECT MAX(`playtime`) FROM `player_history` WHERE a.`id` = `user_id` AND `created_at` < '2021-05-06'))*60) as 'Time' FROM `player_history` h, `accounts` a WHERE h.`created_at` >= '2021-05-06' AND h.`user_id` = a.`id` AND a.`admin` > 0 GROUP BY h.`user_id`

So now it will output:

username time
Michael 00:50:00
Steve 00:50:00
Louise 00:52:00
Amy 00:10:00

But I feel like this whole query is quite inefficient. Is there a better way to do this?

Advertisement

Answer

I think you want lag():

SELECT a.username,
       SEC_TO_TIME(SUM(h.playtime - COALESCE(h.prev_playtime, 0))) as time
FROM accounts a JOIN
     (SELECT h.*,
             LAG(playtime) OVER (PARTITION BY u.user_id ORDER BY h.created_at) as prev_playtime
      FROM player_history h
     ) h
     ON h.user_id = a.id
WHERE h.created_at > '2021-05-06' AND
      a.admin > 0
GROUP BY a.username;

In addition to the LAG() logic, note the other changes to the query:

  • The use of proper, explicit, standard, readable JOIN syntax.
  • The use of consistent columns for the SELECT and GROUP BY.
  • The removal of single quotes around the column alias.
  • The removal of backticks; they just clutter the query, making it harder to write and to read.
User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement