I have a time series in a SQLite Database and want to analyze it.
The important part of the time series consists of a column with different but not unique string values. I want to do something like this:
Value concat countValue A A 1 A A,A 1 B A,A,B 1 B A,B,B 2 B B,B,B 3 C B,B,C 1 B B,C,B 2
I don’t know how to get the countValue column. It should count all Values of the partition equal to the current rows Value.
I tried this but it just counts all Values in the partition and not the Values equal to this rows Value.
SELECT Value, group_concat(Value) OVER wind AS concat, Sum(Case When Value Like Value Then 1 Else 0 End) OVER wind AS countValue FROM TimeSeries WINDOW wind AS (ORDER BY date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) ORDER BY date ;
The query is also limited by these factors:
- The query should work with any amount of unique Values
- The query should work with any Partition Size (ROWS BETWEEN n PRECEDING AND CURRENT ROW)
Is this even possible using only SQL?
Advertisement
Answer
Here is an approach using string functions:
select value, group_concat(value) over wind as concat, ( length(group_concat(value) over wind) - length(replace(group_concat(value) over wind, value, '')) ) / length(value) cnt_value from timeseries window wind as (order by date rows between 2 preceding and current row) order by date;