What I have: A single table with ids and prices for date and time.
What I want to do: To get the average difference of price for every id between the first and 15 minute of every day. Conditions: I also want to use the last available price in 15 minutes if the 15 is missing. For example if 15 is missing, use the price in 14 minutes.
What I did: My first step was to get the price after 15 minutes of every day for every id.
My code (I did a lot of tests with it, sorry if it doesn’t look good and optimised enough) to get the last price(but not last available) in the 15 minutes interval:
WITH s as (
SELECT DATE(date_value) as dt, date_value, stock_id sid, open FROM stocks_1min
)
SELECT sid, dt, date_value, s.open as second_price
FROM s
INNER JOIN (
SELECT q.stock_id, DATE(q.date_value) as dat
FROM stocks_1min q
GROUP BY q.stock_id, dat
ORDER BY q.stock_id ASC
) as end_price
ON s.sid = end_price.stock_id
AND s.date_value = (SELECT date_value FROM s WHERE s.sid = end_price.stock_id AND s.dt = end_price.dat LIMIT 1) + interval '15 minutes'
ORDER BY sid, dt;
What I didn’t do:
- get the last AVAILABLE price in that interval.
- get the first price in that interval and use it in the same query with the last price in the interval
- If I got both prices then I can use AVG and Group By to get the needed value.
Expected result (example): Table:
|stock_id|open(price)|date_value |
|1 |10 |01-12-2020 09:31|
|1 |12 |01-12-2020 09:46|
|1 |14 |02-12-2020 09:31|
|1 |20 |02-12-2020 09:46|
|2 |14 |01-12-2020 09:31|
|2 |16 |01-12-2020 09:46|
|2 |2 |02-12-2020 09:31|
|2 |16 |02-12-2020 09:46|
Result:
|stock_id|average_difference|
| 1 | 4 | ((12-10) + (20-14)) / 2 = 4
| 2 | 8 | ((16-14) + (16-2)) / 2 = 8
I have never used “Partition By”, but maybe here is the place to start with it? It’s my own project, where I want to test some statistical ideas for the stock market. Thanks in advance!
You can use this link where it’s a sample for testing: http://sqlfiddle.com/#!17/83597/1
Advertisement
Answer
If I have understood correctly, you want to:
- List the prices of each stock from each day during the first 15 minutes of the day
- Identify the first and last (as defined by date_value) of the prices listed in step 1
- Calculate the difference between the prices identified in step 2, by stock and day
- Calculate the average of the differences calculated in step 3, by stock
If I have interpreted your question correctly, then:
WITH dat
AS
(
SELECT stock_id
,date_value::date AS date_of_date_value
,date_value
,MIN(date_value) OVER (PARTITION BY stock_id, date_value::date) AS min_date_value
,MAX(date_value) OVER (PARTITION BY stock_id, date_value::date) AS max_date_value
,"open" AS open_price
FROM stocks_1min
WHERE date_value::time <= '09:45:00.000'
),
dat2
AS
(
SELECT stock_id
,date_of_date_value
,AVG(CASE WHEN min_date_value = date_value THEN open_price ELSE NULL END) AS open_price_from
,AVG(CASE WHEN max_date_value = date_value THEN open_price ELSE NULL END) AS open_price_to
,AVG(CASE WHEN max_date_value = date_value THEN open_price ELSE NULL END)
- AVG(CASE WHEN min_date_value = date_value THEN open_price ELSE NULL END) AS open_price_diff
FROM dat
WHERE date_value = min_date_value OR date_value = max_date_value
GROUP BY stock_id
,date_of_date_value
)
SELECT stock_id, AVG(open_price_diff) AS avg_open_price_diff
FROM dat2
GROUP BY stock_id;
You may find that you need to wrap the calculation of open_price_diff with ABS
, but I have not done that in the example above.