Skip to content
Advertisement

marked user with label by time for each month

Data source

User ID Visit Date
1 2020-01-01 12:29:15
1 2020-01-02 12:30:11
1 2020-04-01 12:31:01
2 2020-05-01 12:31:14

Problem

I need advice im trying to do sub query for this result to mark user as retention if they havent visit back for 3 month. i using this query for the data to get user’s latest visit each month includes null

but i think its really affect to performance if user keep increasing, do you guys have any advice to achieve result like below?

Expected Result

Month User ID Type
1 1 FIRST
2 1 RETENTION
3 1 RETENTION
4 1 REACTIVATE
….
12 1 null
1 2 null
5 2 FIRST
6 2 RETENTION
7 2 RETENTION
8 2 RETENTION
9 2 null
… and so on

or it could be like this

Month First Retention Reactiavate
1 1 0 0
2 0 1 0
3 0 1 0
4 0 0 1
5 1 0 0
6 0 1 0
7 0 1 0
8 0 1 0
9 0 0 0
… and so on

Advertisement

Answer

My solution have some reasonable requirements, but I can work without it (the price is some performance).

I built some helper tables what is autofilled with trigger. The requirement is, that UPDATE or DELETE isn’t allowed on visit table.

mst_user table stores distinct user_id-s and it’s first_visit. user_monthly_visit table stores the last and the first visit_date pro user_id and month.

TABLES

TRIGGER

TESTDATA

QUERY

RESULT

month user_id user_type
1 1 FIRST
2 1 RETENTION
3 1 RETENTION
4 1 REACTIVATE
5 1 RETENTION
6 1 RETENTION
7 1 RETENTION
8 1 (null)
9 1 (null)
10 1 (null)
11 1 (null)
12 1 (null)
1 2 (null)
2 2 (null)
3 2 (null)
4 2 (null)
5 2 FIRST
6 2 RETENTION
7 2 RETENTION
8 2 RETENTION
9 2 (null)
10 2 (null)
11 2 (null)
12 2 (null)
User contributions licensed under: CC BY-SA
9 People found this is helpful
Advertisement