Skip to content
Advertisement

How to create a complex aggregate function in sqlite

Suppose I have the following table where there are two sets of observation dates (2015-01-01, 2016-01-01) in the first column. For each observation date, there are associated item_date and item_value.

observation_date item_date item_value
2015-01-01 2012-12-31 0
2015-01-01 2013-03-31 1
2015-01-01 2013-06-30 2
2015-01-01 2013-09-30 3
2015-01-01 2013-12-31 4
2015-01-01 2014-03-31 5
2015-01-01 2014-06-30 6
2015-01-01 2014-09-30 7
2016-01-01 2013-09-30 8
2016-01-01 2013-12-31 9
2016-01-01 2014-03-31 10
2016-01-01 2014-06-30 11
2016-01-01 2014-09-30 12
2016-01-01 2014-12-31 13
2016-01-01 2015-03-31 14
2016-01-01 2015-06-30 15
2016-01-01 2015-09-30 16

If I were to group by observation_date and get max(item_date), I would get the following result.

observation_date item_date item_value
2015-01-01 2014-09-30 7
2016-01-01 2015-09-30 16

Now instead of the max aggregate function, I’d like to create my own function. My goal is as follows: For each observation date, I would like to return the row where the item_date is n years before the max(item_date). For example, if n=1, I would get the rows where the item_date is 1 year ago from the max(item_date).

observation_date item_date item_value
2015-01-01 2013-09-30 3
2016-01-01 2014-09-30 12

Note that I simply do not want just the dates, but the whole row. I’ve been looking at sqlite’s create_aggregate functionality but I don’t know how to return the full row. Any ideas how I can do this?

If there are more efficient ways to achieve this, please let me know.

Advertisement

Answer

You need a correlated subquery that returns for each observation_date the max date minus ? years:

sql = """
SELECT t1.*
FROM tablename t1
WHERE t1.item_date = (
  SELECT DATE(MAX(t2.item_date), '-' || ? || ' year') 
  FROM tablename t2 
  WHERE t2.observation_date = t1.observation_date
);
"""
cursor = conn.cursor()
cursor.execute(sql, ("1",))

See the demo.

3 People found this is helpful
Advertisement