I have a table like:
+------+------+-------+ | Name | Date | Price | +------+------+-------+ | A | d1 | 5 | | A | d2 | 5.5 | | A | d3 | 5.75 | | A | d4 | 5.65 | | A | d5. | 5.61 | | A | d6 | 5.7 | | B | d1 | 12 | | B | d2 | 11.8 | | B | d3 | 11.9 | | B | d4 | 11.95 | | B | d5 | 11.96 | | B | d6 | 11.98 | +------+------+-------+
I want this output:
+------+------------+ | Name | Array | +------+------------+ | A | [2, -2, 1] | | B | [-1, 4] | +------+------------+
Basically the array has the count of consecutive increments/decrements. The table has around 10 million rows. Right now doing via a script which makes it very slow. Can it be done through SQL.
I am using Postgres 12.4
Advertisement
Answer
This is a type of gaps-and-islands problem
with tt as ( select t.*, (price > lag(price) over (partition by name order by date)) as is_inc from t ) select name, array_agg( case when is_inc then cnt else - cnt end order by min_date) as result from (select name, is_inc, count(*) as cnt, min(date) as min_date from (select t.*, row_number() over (partition by name order by date) as seqnum, row_number() over (partition by name, is_inc order by date) as seqnum_2 from tt where prev_price is not null ) t group by name, is_inc, seqnum - seqnum_2 ) n group by name;
Here is a db<>fiddle.