I have a user table that store level of each category completion. User can unlock the second and the third category after a few levels of the first category completion.
My goal is to find which level (in first category) they unlocked the other categories.
Note: Data are not original!
Example:
| times | users | levels | mode | |----------------------------|-------|-----------|---------| | 2019-07-30 10:39:55.000000 | A | 1 | First | | 2019-07-30 10:43:16.000000 | A | 2 | First | | 2019-07-30 10:45:03.000000 | A | 3 | First | | 2019-07-30 10:47:20.000000 | A | 999 | Second | | 2019-07-30 10:49:50.000000 | A | 999 | Second | | 2019-07-30 20:21:39.000000 | B | 1 | First | | 2019-07-31 11:10:35.000000 | B | 2 | First | | 2019-07-31 11:11:51.000000 | B | 3 | First | | 2019-07-31 11:13:01.000000 | B | 4 | First | | 2019-07-31 11:15:11.000000 | B | 5 | First | | 2019-07-31 11:17:24.000000 | B | 999 | Third | | 2019-08-01 02:16:13.000000 | B | 999 | Second | | 2019-08-01 02:29:31.000000 | A | 4 | First | | 2019-08-01 08:04:01.000000 | A | 5 | First | | 2019-08-01 08:06:27.000000 | A | 999 | Third | | 2019-08-01 08:10:02.000000 | A | 1 | First | | 2019-08-01 08:12:29.000000 | A | 999 | Second | | 2019-08-02 04:45:43.000000 | A | 999 | Third | | 2019-08-02 07:42:35.000000 | C | 1 | First | | 2019-08-02 08:12:30.000000 | C | 2 | First | | 2019-08-02 08:15:53.000000 | C | 3 | First | | 2019-08-02 08:17:24.000000 | D | 1 | First |
So
User A unlocked the second category after complete the First category Level 3, unlocked the third category after Level 5 User B unlocked both second and third category after level 5 User C No any unlock category User D No any unlock category
Note: Only First category has levels other categories are fixed level (999)
I tried to use Lag()
over partition by and take the latest value of the first category level:
SELECT users, times, levels, mode, rnk, lag(levels, 1) OVER (PARTITION BY users ORDER BY times, mode) last_story_level FROM ( SELECT users times, CASE WHEN mode = 'First' THEN levels ELSE NULL END levels, mode, -- rnk will I use rnk=1 for the first value of each mode row_number() OVER (PARTITION BY mode, users ORDER BY times) rnk FROM my_table ORDER BY times )
and result is:
| time | user | level | mode | rnk| last_story_level| |----------------------------|-----------------|--- |-------|---|---| | 2019-07-30 10:39:55.000000 | A | 1 | First | 1 | NULL | | 2019-07-30 10:43:16.000000 | A | 2 | First | 2 | 1 | | 2019-07-30 10:45:03.000000 | A | 3 | First | 3 | 2 | | 2019-07-30 10:47:20.000000 | A | NULL | Second | 1 |3 | | 2019-07-30 10:49:50.000000 | A | NULL | Second | 2 |NULL| | 2019-07-30 20:21:39.000000 | B | 1 | First | 1 |NULL| | 2019-07-31 11:10:35.000000 | B | 2 | First | 2 |1 | | 2019-07-31 11:11:51.000000 | B | 3 | First | 3 |2 | | 2019-07-31 11:13:01.000000 | B | 4 | First | 4 |3 | | 2019-07-31 11:15:11.000000 | B | 5 | First | 5 |4 | | 2019-07-31 11:17:24.000000 | B | NULL | Third | 1 |5 | | 2019-08-01 02:16:13.000000 | B | NULL | Second | 1 |NULL| | 2019-08-01 02:29:31.000000 | A | 4 | First | 4 |NULL| | 2019-08-01 08:04:01.000000 | A | 5 | First | 5 |4 | | 2019-08-01 08:06:27.000000 | A | NULL | Third | 1 |5 | | 2019-08-01 08:10:02.000000 | A | 1 | First | 6 |NULL| | 2019-08-01 08:12:29.000000 | A | NULL | Second | 3 |1 | | 2019-08-02 04:45:43.000000 | A | NULL | Third | 2 |NULL| | 2019-08-02 07:42:35.000000 | C | 1 | First | 1 |NULL| | 2019-08-02 08:12:30.000000 | C | 2 | First | 2 |1 | | 2019-08-02 08:15:53.000000 | C | 3 | First | 3 |2 | | 2019-08-02 08:17:24.000000 | D | 1 | First | 1 |NULL|
The problem is when the user repeat lower level again the last value is not the highest level anymore,
So what I would like to this:
| time | user | last_story_level| mode | |----------------------------|-------|----------|-------- | | 2019-07-30 10:47:20.000000 | A | 3 | Second | | 2019-08-01 08:06:27.000000 | A | 5 | Third | | 2019-07-31 11:17:24.000000 | B | 5 | Third | | 2019-08-01 02:16:13.000000 | B | 5 | Second | | 2019-08-02 08:15:53.000000 | C | 3 | Not open any category | | 2019-08-02 08:17:24.000000 | D | 1 | Not open any category |
Advertisement
Answer
If I understand correctly, you want the previous highest value of level
for “First” for each user when they first go to the “next” mode.
You can use a cumulative max to get the previous level for “First” and then distinct on
to get only one row per user/mode:
select distinct on (user, mode) t.* from (select t.*, max(case when mode = 'First' then level end) over (partition by user order by time rows between unbounded preceding and current row ) as prev_first_level from my_table t ) t order by user, mode, time;
EDIT:
In Redshift, you can do:
select t.* from (select t.*, max(case when mode = 'First' then level end) over (partition by user order by time rows between unbounded preceding and current row ) as prev_first_level, row_number() over (partition by user, mode order by time) as seqnum from my_table t ) t where seqnum = 1;