Skip to content
Advertisement

How do I make an aggregate on an integer with a grouped column, for which I only want some included?

I have a table prices holding all prices that some products have had:

CREATE TABLE prices (
  id INT,
  product_id INT, /*Foreign key*/
  created_at TIMESTAMP,
  price INT
);

The first entity for a product_id is it’s initial sales price. If the product is then reduced, a new entity will be added.

I would like to find the mean and total price change per day across all products.

This is some sample data:

INSERT INTO prices (id, product_id, created_at, price) VALUES (1, 1, '2020-01-01', 11000);
INSERT INTO prices (id, product_id, created_at, price) VALUES (2, 2, '2020-01-01', 3999);
INSERT INTO prices (id, product_id, created_at, price) VALUES (3, 3, '2020-01-01', 9999);
INSERT INTO prices (id, product_id, created_at, price) VALUES (4, 4, '2020-01-01', 2000);
INSERT INTO prices (id, product_id, created_at, price) VALUES (5, 1, '2020-01-02', 9999);
INSERT INTO prices (id, product_id, created_at, price) VALUES (6, 2, '2020-01-02', 2999);    
INSERT INTO prices (id, product_id, created_at, price) VALUES (7, 5, '2020-01-02', 2999);
INSERT INTO prices (id, product_id, created_at, price) VALUES (8, 1, '2020-01-03', 8999);
INSERT INTO prices (id, product_id, created_at, price) VALUES (9, 1, '2020-01-03 10:00:00', 7000);
INSERT INTO prices (id, product_id, created_at, price) VALUES (10, 5, '2020-01-03', 4000);
INSERT INTO prices (id, product_id, created_at, price) VALUES (11, 6, '2020-01-03', 3999);
INSERT INTO prices (id, product_id, created_at, price) VALUES (12, 3, '2020-01-03', 6999);

The expected result should be:

date       mean_price_change    total_price_change
2020-01-01 0                    0
2020-01-02 1000.5               2001
2020-01-03 1666                 4998

Explanation:

  • Mean price reduction and total on ‘2020-01-01’ was 0 as all products were new on that date.
  • On ‘2020-01-02’ however the mean price change was: (11000-9999 + 3999-2999)/2 = 1000.5 as both product_id 1 and 2 has been reduced to 9999 and 2999 on that day, and their previous prices were 11000 and 3999 and there total reduction would be: (11000-9999 + 3999-2999) = 2001.
  • On ‘2020-01-03’ only product_id 1, 3 and 5 were changed. 1 at two different times on the day: 9999 => 8999 => 7000 (last one governing) and 3: going from 9999 => 6999 a then 5: going up from 2999 => 4000. This gives a total of: (9999-7000 + 9999-6999 + 2999-4000) = 4998 and a mean price reduction on that day of: 1666

I have added the data here too: https://www.db-fiddle.com/f/tJgoKFMJxcyg5gLDZMEP77/1

I stated to play around with some DISTINCT ON but that does not seem to do it…

Advertisement

Answer

You seem to want lag() and aggregation:

select created_at, avg(prev_price - price), sum(prev_price - price)
from (select p.*, lag(price) over (partition by product_id order by created_at) as prev_price
      from prices p
     ) p
group by created_at
order by created_at;

You have two prices for product 1 on 2020-01-03. Once I fix that, I get the same results as in your question. Here is the db<>fiddle.

EDIT:

To handle multiple prices per day:

select created_at, avg(prev_price - price), sum(prev_price - price)
from (select p.*, lag(price) over (partition by product_id order by created_at) as prev_price
      from (select distinct on (product_id, created_at::date) p.*
            from prices p
            order by product_id, created_at::date
           ) p
     ) p
group by created_at
order by created_at;
User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement