I have a table with data like the tableA example below. the date column is formatted as a string.
the close column is an integer, ticker is formatted as string.
I’m trying to run the query below on a mysql database and it is taking a very long time.
is there anything I can do to speed this up, like changing the format of the date column, or adding
indices or primary keys? The combination of ticker and date should be a unique value, and the date field is a timestamp, it’s just currently formatted as string.
code:
select avg((a.close-b.close)/b.close) as avg_annual_returns, a.ticker from tableA a join tableA b on cast(a.date as date)=date_add(cast(b.date as date),interval 365 DAY) and a.ticker=b.ticker where b.close is not null group by a.ticker
tableA
+--------+-----+------+ |date |close|ticker| +--------+-----+------+ |2/1/2019|5 |abc | +--------+-----+------+ |2/3/2019|7 |efd | +--------+-----+------+ |2/4/2019|3 |hij | +--------+-----+------+
update answer:
select ticker,date, ( -1 + a.close / max(a.close) over (partition by ticker order by date range between interval 365 day preceding and interval 365 day preceding ) ) as annual_returns from tableA a ) b where annual_returns is not null group by ticker
Advertisement
Answer
If you want the difference from a year ago, then use window functions. Before that, though, fix the data model! Do not store dates as strings. So:
alter table talbeA modify column date date;
Then to get the close from a year ago:
select( -1 + a.close / max(a.close) over (partition by ticker order by date range between interval 365 day preceding and interval 365 day preceding ) ) from tablea a;
You don’t have to worry about NULL
values because AVG()
ignores them.
Here is a db<>fiddle.