
Returning median values of time deltas across different groups

I'm trying to calculate the time difference between successive steps in my table and return the median of each difference, using this SQL code:

SELECT median(datediff(seconds,one,two)) as step_one,
       median(datediff(seconds,two,three)) as step_two
FROM Table

This returns the following error message:

[0A000][500310] Amazon Invalid operation: within group ORDER BY clauses for aggregate functions must be the same; java.lang.RuntimeException: com.amazon.support.exceptions.ErrorException: Amazon Invalid operation: within group ORDER BY clauses for aggregate functions must be the same;

Note: the query does work when I return only one median value.

Here is a sample of my table:

one                                 two                        three    
2015-12-14 19:01:58.014247  2015-12-21 17:36:06.187302  2015-12-14 19:10:00.040057  2015-12-14 19:03:18.153519
2016-01-02 05:18:50.351975  2016-01-02 05:26:10.660299  2016-01-02 05:22:58.353365  2016-01-02 05:19:34.915794
2016-02-08 07:29:23.938046  2016-02-08 07:41:42.016819  2016-02-08 07:31:23.899776  2016-02-08 07:30:03.168844
2016-02-25 18:25:39.223014  2016-02-25 18:31:07.087808  2016-02-25 18:29:02.490969  2016-02-25 18:26:20.188472
2015-11-26 12:02:27.033141  2015-11-26 12:07:52.813699  2015-11-26 12:06:33.106484  2015-11-26 12:03:09.152853

2015-12-18 08:44:13.184319  2015-12-18 13:10:51.707354  2015-12-18 13:09:35.938711  2015-12-18 13:02:22.650966
2016-01-31 06:41:55.165849  2016-01-31 06:44:58.004319  2016-01-31 06:43:25.923505  2016-01-31 06:42:29.955232
2016-02-15 12:22:29.051259  2016-02-22 09:29:15.649721  2016-02-22 08:40:45.221558  2016-02-16 06:52:52.368139

The desired result is the median time delta between one and two, and between two and three (there are more columns in the actual data).


Answer

If a statement includes multiple calls to sort-based aggregate functions (LISTAGG, PERCENTILE_CONT, or MEDIAN), they must all use the same ORDER BY values. Note that MEDIAN applies an implicit order by on the expression value.

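From https://docs.aws.amazon.com/redshift/latest/dg/r_PERCENTILE_CONT.html

In your query the two MEDIAN calls sort on different expressions, datediff(seconds,one,two) and datediff(seconds,two,three), so their implicit ORDER BY clauses differ. That is why a single median works and two do not.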
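A common workaround is to compute each median in its own subquery and then combine the single-row results, so that no query block contains two MEDIAN calls with different sort expressions. The following is a minimal sketch, not something I have run against your cluster: my_table is a placeholder for your table name, and the aliases s1 and s2 are made up for illustration.

```sql
-- Sketch only: each subquery holds exactly one MEDIAN call, so the
-- "ORDER BY clauses must be the same" restriction is not triggered
-- within any single query block.
-- my_table is a placeholder; replace it with the real table name.
SELECT s1.step_one,
       s2.step_two
FROM (
    -- median number of seconds between column one and column two
    SELECT MEDIAN(DATEDIFF(seconds, one, two)) AS step_one
    FROM my_table
) AS s1
CROSS JOIN (
    -- median number of seconds between column two and column three
    SELECT MEDIAN(DATEDIFF(seconds, two, three)) AS step_two
    FROM my_table
) AS s2;
```

Each subquery returns exactly one row, so the CROSS JOIN produces a single row with one column per step. For additional steps, add one more subquery per median. The table is scanned once per subquery, which may matter on large data.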

User contributions licensed under: CC BY-SA