Skip to content
Advertisement

How to fill missing values in aggregate-by-time function

I have function (from this question) which groups values by every 5 minutes and calculate min/avg/max:

SELECT (FLOOR(clock / 300) * 300) as period_start,
       MIN(value), AVG(value), MAX(value)
FROM data 
WHERE clock BETWEEN 1200000000 AND 1200001200
GROUP BY FLOOR(clock / 300);

However, due to missing values, some five-minute periods are skipped, making the timeline inconsistent. How to make it so that in the absence of data for a certain period, the value of max / avg / min becomes 0, instead of being skipped?

For example:

If I have timestamp – value

  • 1200000001 – 100
  • 1200000002 – 300
  • 1200000301 – 100
  • 1200000601 – 300
I want to get this: (select min/avg/max, time between 1200000000 and 1200001200)
  • 1200000000 – 100/200/300
  • 1200000300 – 100/100/100
  • 1200000600 – 300/300/300
  • 1200000900 – 0/0/0
Instead of this: (time between 1200000000 and 1200001200)
  • 1200000000 – 100/200/300
  • 1200000300 – 100/100/100
  • 1200000600 – 300/300/300
  • 1200000900 – THIS LINE WILL NOT BE, I will only get 3 lines above. No data between 1200000900 and 1200001200 for calculation.

My Answer:

Generate first table with required time range, and then left join this generated table on query with common group by operator. Such like this:

select * from

(select UNIX_TIMESTAMP(gen_date) as unix_date from
(select adddate('1970-01-01',t4*10000 + t3*1000 + t2*100 + t1*10 + t0) gen_date from
 (select 0 t0 union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t0,
 (select 0 t1 union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t1,
 (select 0 t2 union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t2,
 (select 0 t3 union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t3,
 (select 0 t4 union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t4) v
where gen_date between '2017-01-01' and '2017-12-31') date_range_table

left join (
    SELECT (FLOOR(clock / 300) * 300) as period_start,
       MIN(value), AVG(value), MAX(value)
    FROM table
    WHERE clock BETWEEN 1483218000 AND 1514667600
    GROUP BY FLOOR(clock / 300)) data_table

on date_range_table.unix_date = data_table.period_start;

Advertisement

Answer

Alternative answer is generate first table with required time range, and then left join this generated table on query with common group by operator.

User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement