Long time lurker/self helper where I have been able to crack things but it is either too late in the evening where I am or I am just missing something obvious. I have been trying to create an automated chart / query for the utilisation of my router.
I have a nested query that returns the following:
Record_Date | Mbps_IN | Mbps_OUT YYYYMMDD HH:00 | 1234 | 1234
this should have one entry per hour but due to data collection issues from my router there are often missing hours or even days of data missing. The nature of the counter is a “delta” so elsewhere in the “raw data” I am capturing the delta of data volume between the previous record which results in a flat line for a number of hours and then a very big data value often 2-3 times bigger due to it containing multiple hours of utilisation recorded against the first hour the data feed returned.
Ultimately I would like to find a way to smooth / build an average from this spike and backfill the missing hours. (but that is a challenge for another day)
In the first instance I would like simply only select the rows where the value in Mbps_In is less than 1000.
However, when I do this from either metabase or a dbeaver connection direct to my PrestoDB I get an error
Column 'results.Mbps_In' cannot be resolved {:message "line 27:7: Column 'results.Mbps_in' cannot be resolved", :errorCode 47, :errorName "COLUMN_NOT_FOUND",
My Query works just fine to give the tabular output including the outliers as follows
select metrics_date_hour Record_Date ,round(In_Utilisation_Mbps_Total,2) as Mbps_In ,round(Out_Utilisation_Mbps_Total,2) as Mbps_Out from ( nested query ) results -- WHERE results.Mbps_In < 1000 Group By Record_Date, Order By Record_Date desc
When I uncomment the Where clause I get the error on the failure to resolve the column name. I feel like this should not be difficult but I have tried a few variations and efforts at referencing some of the original columns that were processed earlier to get to this results output but I am still failing to correctly reference the column from the results table.
Any advice or pointers would be very gratefully received.
Updated with successful query:
select metrics_date_hour Record_Date ,round(sum(In_Utilisation_Mbps_Total),2) as Mbps_In ,round(sum(Out_Utilisation_Mbps_Total),2) as Mbps_Out from ( nested query ) results -- WHERE results.Mbps_In < 1000 - I didn't get this to work Group By Record_Date Having (sum(In_Utilisation_Mbps_Total) <1000 Order By Record_Date desc
Advertisement
Answer
The error is produced because you don’t have a column named Mbps_In in your nested query. I thing that you really need a HAVING clause not a WHERE. Try to change it to this:
select metrics_date_hour Record_Date ,round(In_Utilisation_Mbps_Total,2) as Mbps_In ,round(Out_Utilisation_Mbps_Total,2) as Mbps_Out from ( nested query ) results Group By Record_Date Having Mbps_In<1000 Order By Record_Date desc
If you still want too use the WHERE clause, you need to change your column name:
select metrics_date_hour Record_Date ,round(In_Utilisation_Mbps_Total,2) as Mbps_In ,round(Out_Utilisation_Mbps_Total,2) as Mbps_Out from ( nested query ) results Where In_Utilisation_Mbps_Total<1000 Group By Record_Date Order By Record_Date desc