I have legacy tables which tracks flight and had to extract data. We have three tables named booking, airlines and flighttype. Note this is a dummy samples
booking :
id | customer | request_date | airline | flightType | price | currency |
---|---|---|---|---|---|---|
1 | 1 | 11-20-2020 10:23 | 1 | 1 | 120 | Eur |
2 | 1 | 11-21-2020 10:24 | 1 | 2 | 110 | CHF |
3 | 2 | 11-01-2020 11:25 | 2 | 2 | 120 | Eur |
4 | 1 | 15-01-2020 10:23 | 1 | 1 | 100 | Eur |
5 | 1 | 11-01-2020 11:23 | 1 | 2 | 60 | Eur |
6 | 1 | 12-01-2020 10:23 | 1 | 3 | 35 | Eur |
airline :
id | airline |
---|---|
1 | French |
2 | Swiss |
type :
id | flightType |
---|---|
1 | domestic |
2 | international |
Now the data we are trying to figure out is number of bookings consecutively within x days (let say if two days it would mean how many bookings were made in 2 days) for various parameters like
- airline
- flightType
- airline & flightype
- currency
- price total price
For example lets say I wish to see what is the percentage of customer who have made multiple bookings within x days across multiple airline I should be able to do so or if I want to see the total revenue of customers who have made multiple booking within x days or customers who have made multiple booking with different set of currencies with x days
I am trying to make self join to extract such data and then group it but I am always reaching a dead end
SELECT t1.customer, t1.request_date, t1.airline, count(*) FROM booking t1 JOIN booking t2 ON t1.customer= t2.customer WHERE t2.request_date > t1.request_date and DATEDIFF(t2.request_date, t1.request_date) > 0 and DATEDIFF(t2.request_date, t1.request_date) <=2 GROUP BY t1.customer, t1.request_date
The problem I am facing is the date has time and it gives me wrong results. And I am not sure what is the right way to get %share of customers who make booking in such way like % share of customers who book more than one flight / more than one type of flight within span of x days.
Sorry if the question is too vague or if this violates any rules. By the way I am using mysql 5.5
Advertisement
Answer
I want to see the total revenue of customers who have made multiple booking within x days or customers who have made multiple booking with different set of currencies with x days
You can answer questions like this using window functions. For the first question, this looks like:
select count(distinct customer) from (select b.*, lag(request_date) over (partition by customer order by request_date) as prev_request_date from booking b ) b where request_date <= prev_request_date + interval <n> day;