Skip to content
Advertisement

What SQL query can be used to limit continious periods by parameter value, and then to calculate datediff inside them?

I have a table of phone calls consisting of user_id, call_date, city, where city can be either A or B. It looks like this:

user_id call_date city
1 2021-01-01 A
1 2021-01-02 B
1 2021-01-03 B
1 2021-01-05 B
1 2021-01-10 A
1 2021-01-12 B
1 2021-01-16 A
2 2021-01-17 A
2 2021-01-20 B
2 2021-01-22 B
2 2021-01-23 A
2 2021-01-24 B
2 2021-01-26 B
2 2021-01-30 A

For this table, we need to select for each user all the periods when he was in city B. These periods are counted in days and start when the first call is made from city B, and end as soon as the next call is made from city A. So for user_id = 1 fist period starts on 2021-01-02 and ands on 2021-01-10. There can be several such periods for each user.

The result should be the following table:

user_id period_1 period_2
1 8 4
2 3 6

Can you please tell me how I can limit the periods according to the condition of the problem, and then calculate the datediff within each period? Thank you

Advertisement

Answer

This is a typical gaps and islands problem. You need to group consecutive rows first, then find the first call_date of the next group. Sample code for Postgres is below, the same may be adapted to another DBMS by applying appropriate function to calculate the difference in days.

user_id | city | dt_from    | dt_to      | diff
:------ | :--- | :--------- | :--------- | ---:
1       | B    | 2021-01-02 | 2021-01-10 |    8
1       | B    | 2021-01-12 | 2021-01-16 |    4
2       | B    | 2021-01-20 | 2021-01-23 |    3
2       | B    | 2021-01-24 | 2021-01-30 |    6

db<>fiddle here

User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement