Skip to content
Advertisement

In Postgresql, how do I use joins with multiple conditions including >= and <=

I have table A and table B. Each row in table A represents every time a user sends a message. Each row in table B represents every time a user buys a gift.

Goal: for each time a user sends a message, calculate how many gifts they’ve purchased within 7 days before the timestamp they sent the message. Some users never send messages and some never purchased gifts. If the user in table A didn’t have gift purchased within 7 days, the count should be 0.

Table A:

user_id time
12345 2021-09-04 09:43:55
12345 2021-09-03 00:39:30
12345 2021-09-02 03:26:07
12345 2021-09-05 15:48:34
23456 2021-09-09 09:06:22
23456 2021-09-08 08:06:21
00001 2021-09-03 15:38:15
00002 2021-09-03 15:38:15

Table B:

user_id time
12345 2021-09-01 09:43:55
12345 2021-08-03 00:42:30
12345 2021-09-03 02:16:07
00003 2021-09-05 15:48:34
23456 2021-09-03 09:06:22
23456 2021-09-10 08:06:21

Expected output:

user_id time count
12345 2021-09-04 09:43:55 2
12345 2021-09-03 00:39:30 1
12345 2021-09-02 03:26:07 1
12345 2021-09-05 15:48:34 2
23456 2021-09-09 09:06:22 1
23456 2021-09-08 08:06:21 1
00001 2021-09-03 15:38:15 0
00002 2021-09-03 15:38:15 0

Query I tried:

SELECT A.user_id, A.time, coalesce(count(*), 0) as count
FROM A
LEFT JOIN B ON A.user_id = B.user_id AND B.time >= A.time - INTERVAL '7 days' AND B.time < A.time
GROUP BY 1,2

The count returned doesn’t match the expected result however, not sure if I’m doing the join and conditions correctly.

Advertisement

Answer

You need to count the values from the possibly NULL columns i.e. from table B in order to get the correct counts of non-existent matches. i.e. being more specific in COUNT(*) to COUNT(b.column_from_b_table). See modification with working demo fiddle below:

SELECT 
    A.user_id, 
    A.time, 
    coalesce(count(B.user_id), 0) as count
FROM A
LEFT JOIN B ON A.user_id = B.user_id AND 
               B.time >= A.time - INTERVAL '7 days' AND 
               B.time < A.time
GROUP BY 1,2;
user_id time count
1 2021-09-03T15:38:15.000Z 0
12345 2021-09-05T15:48:34.000Z 2
23456 2021-09-08T08:06:21.000Z 1
12345 2021-09-04T09:43:55.000Z 2
12345 2021-09-03T00:39:30.000Z 1
23456 2021-09-09T09:06:22.000Z 1
2 2021-09-03T15:38:15.000Z 0
12345 2021-09-02T03:26:07.000Z 1

View on DB Fiddle

Let me know if this works for you.

User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement