I have a time series in a SQLite Database and want to analyze it.
The important part of the time series consists of a column with different but not unique string values. I want to do something like this:
x
Value concat countValue
A A 1
A A,A 1
B A,A,B 1
B A,B,B 2
B B,B,B 3
C B,B,C 1
B B,C,B 2
I don’t know how to get the countValue column. It should count all Values of the partition equal to the current rows Value.
I tried this but it just counts all Values in the partition and not the Values equal to this rows Value.
SELECT
Value,
group_concat(Value) OVER wind AS concat,
Sum(Case When Value Like Value Then 1 Else 0 End) OVER wind AS countValue
FROM TimeSeries
WINDOW
wind AS (ORDER BY date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
ORDER BY
date
;
The query is also limited by these factors:
- The query should work with any amount of unique Values
- The query should work with any Partition Size (ROWS BETWEEN n PRECEDING AND CURRENT ROW)
Is this even possible using only SQL?
Advertisement
Answer
Here is an approach using string functions:
select
value,
group_concat(value) over wind as concat,
(
length(group_concat(value) over wind) - length(replace(group_concat(value) over wind, value, ''))
) / length(value) cnt_value
from timeseries
window wind as (order by date rows between 2 preceding and current row)
order by date;