In a Postgres DB, given the following simplified dataset of orders
:
ID STORE_ID UPDATED_AT
1 "store-1" 2021-01-01
2 "store-2" 2021-01-02
3 "store-3" 2021-01-03
4 "store-1" 2021-01-04
5 "store-2" 2021-01-05
6 "store-3" 2021-01-06
1000 "store-1" 2021-02-05
How can I write a query to select the the first 5 unique store id’s, and the first 10 orders per store ordered by oldest to newest, resulting in a maximum of 50 rows returned? For example:
ID STORE_ID UPDATED_AT
1 "store-1" 2021-01-01
4 "store-1" 2021-01-04
13 "store-1" 2021-01-12
2 "store-2" 2021-01-02
5 "store-2" 2021-01-05
18 "store-2" 2021-01-08
3 "store-3" 2021-01-03
8 "store-3" 2021-01-12
22 "store-3" 2021-01-22
My goal is to process orders from oldest to newest, but process the oldest 10 orders per store as I can batch them with the Shopify API which would be more efficient.
For example, in my code I will combine it to something like:
{
"store-1": [{ /* order */ }, { /* order */ }, { /* order */ }, ], // Array of 10 orders
"store-2": [{ /* order */ }, { /* order */ }, { /* order */ }, ], // Array of 10 orders
"store-3": [{ /* order */ }, { /* order */ }, { /* order */ }, ], // Array of 10 orders
"store-4": [{ /* order */ }, { /* order */ }, { /* order */ }, ], // Array of 10 orders
"store-5": [{ /* order */ }, { /* order */ }, { /* order */ }, ], // Array of 10 orders
}
So I can run 5 API calls in parallel for each store.
I’ve tried multiple queries, including the following:
SELECT a.store_id FROM orders a
INNER JOIN
(SELECT store_id FROM orders GROUP BY store_id ORDER BY MIN(updated_at) LIMIT 5) b
ON a.store_id = b.store_id
ORDER BY a.updated_at ASC;
But I cannot limit the rows to 10 per store ID
Advertisement
Answer
SELECT
o.*
FROM (
SELECT
store_id
FROM (
SELECT DISTINCT ON (store_id) -- 1a
store_id, updated_at
FROM orders
ORDER BY store_id, updated_at
) s
ORDER BY updated_at -- 1b
LIMIT 5
) s
CROSS JOIN LATERAL ( -- 2
SELECT
*
FROM orders o
WHERE o.store_id = s.store_id
ORDER BY updated_at
LIMIT 10
) o
1 a) Return only one record per store using DISTINCT ON
b) Identify the top 5 stores ordering by the date.
2. These stores can be used in a lateral join to filter the orders per store, again using the updated_at
order.