I have the following dataset, that have the prices for many products and cities:
CREATE TABLE my_table ( the_id varchar(5) NOT NULL, the_date timestamp NOT NULL, the_city varchar(5) NOT NULL, the_product varchar(1) NOT NULL, the_price numeric(4,2) NOT NULL ); INSERT INTO my_table VALUES ('VIS01', '2019-05-02 09:00:00','LISBO','A',8), ('VIS02', '2019-05-04 12:00:00','EVORA','A',7), ('VIS03', '2019-05-05 18:00:00','LISBO','B',5), ('VIS04', '2019-05-06 18:30:00','PORTO','B',5), ('VIS05', '2019-05-15 12:05:00','PORTO','A',6), ('VIS06', '2019-05-18 18:06:00','EVORA','B',4), ('VIS07', '2019-05-23 18:07:00','PORTO','C',10), ('VIS08', '2019-05-26 18:08:00','EVORA','E',14), ('VIS09', '2019-06-07 18:09:00','LISBO','B',5), ('VIS10', '2019-06-09 18:10:00','LISBO','D',11), ('VIS11', '2019-06-12 18:11:00','EVORA','C',10), ('VIS12', '2019-06-15 18:12:00','LISBO','D',11), ('VIS13', '2019-06-15 18:13:00','EVORA','A',7), ('VIS14', '2019-06-18 18:14:00','PORTO','A',6), ('VIS15', '2019-06-23 18:15:00','LISBO','A',8), ('VIS16', '2019-06-25 18:16:00','LISBO','A',8), ('VIS17', '2019-06-27 18:17:00','PORTO','B',5), ('VIS18', '2019-06-27 18:18:00','LISBO','D',11), ('VIS19', '2019-06-28 18:19:00','LISBO','A',8), ('VIS20', '2019-06-30 18:20:00','EVORA','B',4), ('VIS21', '2019-07-01 18:21:00','EVORA','A',6), ('VIS22', '2019-07-04 18:30:00','EVORA','D',10), ('VIS23', '2019-07-04 18:31:00','EVORA','B',3), ('VIS24', '2019-07-06 18:40:00','EVORA','C',11), ('VIS25', '2019-07-12 18:50:00','EVORA','E',14), ('VIS26', '2019-07-15 18:00:00','PORTO','A',7), ('VIS27', '2019-07-18 18:00:00','PORTO','C',11), ('VIS28', '2019-07-25 18:00:00','PORTO','B',4), ('VIS29', '2019-07-30 18:00:00','PORTO','A',9), ('VIS30', '2019-07-30 18:00:00','LISBO','A',7);
What I want is to have the average price for each month and each city but it has to follow a rule, that it should have the combination of certain products. Let’s say that the product should be A and B strictly, then the expected table is:
month city product avg_price 2019-05 EVORA A 7 2019-05 EVORA B 4 2019-05 LISBO A 8 2019-05 LISBO B 5 2019-05 PORTO A 6 2019-05 PORTO B 5 2019-06 EVORA A 7 2019-06 EVORA B 4 2019-06 LISBO A 8 2019-06 LISBO B 5 2019-06 PORTO A 6 2019-06 PORTO B 5 2019-07 EVORA A 6 2019-07 EVORA B 3 2019-07 PORTO A 8 2019-07 PORTO B 4
As you see, in this case LISBO
doesn’t appear in 2019-07
because the condition “A and B” strictly doesn’t hold.
I have tried this code:
select date_trunc('month', the_date) as month, the_city, the_product, AVG(the_price) avg_price FROM my_table where the_product in ('A','B') group by month, the_city, the_product order by month desc
But I don’t know how to apply the condition. Please, could you help me with this question? Thank you in advance.
Advertisement
Answer
if you need in rows then you need to group by product as well:
select the_city, date_trunc('month', the_date) as yearmonth, the_product , avg(the_price) as a_avg from ( select * , sum(case when the_product = 'A' then 1 end) over (partition by date_trunc('month', the_date), the_city) acount , sum(case when the_product = 'B' then 1 end) over (partition by date_trunc('month', the_date), the_city) bcount from my_table where the_product in ('A','B') ) t where t.acount > 0 and t.bcount > 0 group by the_city, yearmonth,the_product order by yearmonth, the_city;