I have table A and table B. Each row in table A represents every time a user sends a message. Each row in table B represents every time a user buys a gift.
Goal: for each time a user sends a message, calculate how many gifts they’ve purchased within 7 days before the timestamp they sent the message. Some users never send messages and some never purchased gifts. If the user in table A didn’t have gift purchased within 7 days, the count should be 0.
Table A:
user_id | time |
---|---|
12345 | 2021-09-04 09:43:55 |
12345 | 2021-09-03 00:39:30 |
12345 | 2021-09-02 03:26:07 |
12345 | 2021-09-05 15:48:34 |
23456 | 2021-09-09 09:06:22 |
23456 | 2021-09-08 08:06:21 |
00001 | 2021-09-03 15:38:15 |
00002 | 2021-09-03 15:38:15 |
Table B:
user_id | time |
---|---|
12345 | 2021-09-01 09:43:55 |
12345 | 2021-08-03 00:42:30 |
12345 | 2021-09-03 02:16:07 |
00003 | 2021-09-05 15:48:34 |
23456 | 2021-09-03 09:06:22 |
23456 | 2021-09-10 08:06:21 |
Expected output:
user_id | time | count |
---|---|---|
12345 | 2021-09-04 09:43:55 | 2 |
12345 | 2021-09-03 00:39:30 | 1 |
12345 | 2021-09-02 03:26:07 | 1 |
12345 | 2021-09-05 15:48:34 | 2 |
23456 | 2021-09-09 09:06:22 | 1 |
23456 | 2021-09-08 08:06:21 | 1 |
00001 | 2021-09-03 15:38:15 | 0 |
00002 | 2021-09-03 15:38:15 | 0 |
Query I tried:
SELECT A.user_id, A.time, coalesce(count(*), 0) as count FROM A LEFT JOIN B ON A.user_id = B.user_id AND B.time >= A.time - INTERVAL '7 days' AND B.time < A.time GROUP BY 1,2
The count returned doesn’t match the expected result however, not sure if I’m doing the join and conditions correctly.
Advertisement
Answer
You need to count the values from the possibly NULL columns i.e. from table B in order to get the correct counts of non-existent matches. i.e. being more specific in COUNT(*)
to COUNT(b.column_from_b_table)
. See modification with working demo fiddle below:
SELECT A.user_id, A.time, coalesce(count(B.user_id), 0) as count FROM A LEFT JOIN B ON A.user_id = B.user_id AND B.time >= A.time - INTERVAL '7 days' AND B.time < A.time GROUP BY 1,2;
user_id | time | count |
---|---|---|
1 | 2021-09-03T15:38:15.000Z | 0 |
12345 | 2021-09-05T15:48:34.000Z | 2 |
23456 | 2021-09-08T08:06:21.000Z | 1 |
12345 | 2021-09-04T09:43:55.000Z | 2 |
12345 | 2021-09-03T00:39:30.000Z | 1 |
23456 | 2021-09-09T09:06:22.000Z | 1 |
2 | 2021-09-03T15:38:15.000Z | 0 |
12345 | 2021-09-02T03:26:07.000Z | 1 |
Let me know if this works for you.