I have the query below that assigns a station volume based on the volume processed that week. Station sizes are based by looking at 4 data points throughout the year. This is because station fluctuate in volume throughout the year for example, they may have small volume in one quarter, but large in another. The goal is to assign a size to a station that will best suite it the entire year. The sql below gets together the four data points that will be needed to evaluate what size a station_size will be applied. There are multiple combinations of volume possibilities that can take place over the year which is why they are stored in a separate table. I have a separate table called “station_sizes” that provides the logic for what the station size will be based on the volume taken from the four point in time.
For example if a station processes small volume in wk 14 and small volume in wk27 and standard volume in wk 40 and small volume in wk 47 then they would be a ds small station size. There are 1000s of scenarios in the station_size table.
I want the final output to be the excerpt from the query output + station_size. The station size will be pulled from the station_size table.
I need to know how to have the query below reference the station size table so that it can identify for each year group is wk_16= xyz.. and wk_26 =xyz.. and wk_39 = xyz.. and wk_43 =xyz then it will pull in the size logic form the station size table.
For the ar section in the query. I want the logic to be if a station has “Yes” for AR at a minimum it needs to be a standard station size. I am not sure if I did that peice right. I am fine if its bigger than that but no smaller.
SELECT station, del_date, sum(volume)/7 as volume_ref,DATE_PART("week",del_date) as week_num,DATE_PART("year",del_date) as year, CASE when sum(volume)/7 between 0 and 20000 then 'ds x-small' when sum(volume)/7 between 20000 and 36000 and max(ar) <> 'yes' then 'ds small' when sum(volume)/7 between 36000 and 42000 then 'ds standard' when sum(volume)/7 between 42000 and 72000 then 'ds large' when sum(volume)/7 > 72000 then 'ds x-large' when max(ar) = 'YES' then 'ds standard' else 'ds small' end as station_ref FROM prophecy_na.na_topology_lrp Right JOIN wbr_global.raw_station_extended_attribute ON prophecy_na.na_topology_lrp.station = wbr_global.raw_station_extended_attribute.ds where week_num in(16,26,39,43) GROUP BY station, del_date;
[![enter image description here][1]][1] [![enter image description here][2]][2]
Advertisement
Answer
It appears you are short on some clarification and I’ll get into that in a moment. You may want to consider creating an additional lookup table for your query such as StationSize.
StationSizeID Description MinVolume ButLessThan
Then you could adjust your ranges without adjusting queries (in case there are others that use the same threshold values. You could have values such as
StationSizeID Description MinVolume ButLessThan 1 ds x-small 0 20,000 2 ds small 20,000 36,000 3 ds standard 36,000 42,000 4 ds large 42,000 72,000 5 ds x-large 72,000 9,999,999
We’ll get to a join later.
You also have a test for “max(ar)” in your case being applied LAST, but also under your ds-small test condition. We have no idea where this “ar” field comes from, nor its context of possible values. If string based and you have one record with value of “yes”, then it would supersede any value of “no”, even if the “no” was the latest value for some given origin. Also, by considering the volume ranges of the case/when, and the sum of volume for any given week is 18,000, but it has a “yes” for the max(ar), it would never get to your case test pushing it to the “ds standard” level. Similarly if you had a 75k volume and a max(ar) = “yes”, it too would remain ds x-large and not the ds standard. Is this the intent? Should the “yes” condition be tested up front, so if EVER a “yes”, it is always considered a standard station? If so, that condition needs to change.
Now, my next level needing clarification. You mention 4 weeks over the year but user has no context of what / why the given weeks in question being checked for. Are you looking for volume activity if a given station is growing / declining based on volume activity? If so, are you trying to provide some weight factor to upgrade / downgrade the station size? per your example: originally small, standard, small, small is considered small overall. What if the scenario was: small, small, standard, standard? Would it be upgraded to a standard size station due to its growth?
So, instead of just people throwing queries at you, understanding what you have, what you are looking for, only the proper context can really get you the answer you are probably looking for. I would suggest you edit your existing post, provide clarification within the details of that question instead of direct comments and save. THEN, you can comment me back to review the changes you posted and I can proceed with more.
REVISION / FEEDBACK
Ok, I THINK I am following you from your comments. Your second table that has 1000s of combinations is the basis to compare the volume per the given weeks, and your table is EXPLICITLY referencing the 4 weeks of 14, 27, 40 and 47. But the query you are performing is actually volume of activity per station on different weeks as per your query looking for the date weeks within 16,26,39,43. So, the column headers really dont mean anything of Wk14, Wk27, Wk40, Wk47. They just represent the 4 weeks you WANT to query based on, such as you could query for weeks 2, 11, 24, 31? In which case the status of whatever the first week is, correlates to the matched description of “wk14” in your station size table and similarly for 11 => Wk27, week 24 => Wk40 and lastly week 31 => Wk47. This Station Size table has 1000’s of combinations to take into consideration all (or most) permutations of what values COULD be in the respective weekly positions.
But, you do not have a station size table as I simulated in the first example showing the low / high range, correct? Its ok if not, just trying to look at all the pieces on the playing field.
Am I getting CLOSER to what you are trying to accomplish?
POSSIBLE SOLUTION PLEASE NOTE… I am not familiar with “dbeaver” database, but bet it is MOSTLY common sql compliant. If anything does NOT work, let me know and we can adjust.
Ok, I will try to do this in pieces so you can see how all tie together. Lets start by doing a query per week grouped
select lrp.station, DATE_PART("week", lrp.del_date) as WkNum, sum( lrp.volume ) / 7 as WkAvg, -- since the date part of the group by will always be the same -- correlating to the 1st, 2nd, 3rd or 4th week, I now have this -- extra column and just apply MAX(), otherwise SQL will NAG if -- it is not part of group by and its not an aggregate function max( case when DATE_PART("week", lrp.del_date) = 16 then 1 when DATE_PART("week", lrp.del_date) = 26 then 2 when DATE_PART("week", lrp.del_date) = 39 then 3 else 4 end ) as WeekOrder from prophecy_na.na_topology_lrp lrp where -- EXAMPLE, if you KNOW you only care about within a current year -- this will help optimize by pre-qualifying the single year you -- expect the week activity within lrp.del_Date >= '2021-01-01' AND lrp.del_date < '2022-01-01' -- and NOW, only filter out for the 4 specific weeks you are about AND DATE_PART("week", lrp.del_date) in (16,26,39,43) group by lrp.station, DATE_PART("week", lrp.del_date)
Now, if you wanted weeks that scroll over a year time, such as from October of 2020 to August of 2021, just put the starting / ending dates to filter that out so your week “43” would be representative of Year 2020 and week 3 would be Jan 2021, and a mental map on the weeks you want might be though of as (week 43 = October of 2020, week 2 = Jan of 2021, week 15 = April 2021, etc.), so your weeks might look like (43, 2, 15, 21). So do you see the rolling weeks I THINK you are trying to get covered?
If so, the above query might come up with results like below.
Station WkNum WkAvg WeekOrder x 16 34142 1 x 26 36879 2 x 39 35387 3 x 43 37114 4 y 16 43872 1 y 26 41983 2 y 39 42218 3 y 43 39461 4 z 16 ...
So now you have one master list of records, per station and the proper corresponding week you intend them per the query, and if you did the year roll-over, just change the case condition for proper week order context for week 1-4.
Now, we can run THIS against your case for the ranges to pull the descriptors. So I will use the entire query above as a “Prequery” (pq alias) and build out more. But NOW, I will apply a cross-tab to pull in the 4 possible weeks into a single row that will more directly match your Station_Size table.
select pq.Station, max( case when pq.WeekOrder = 1 then VolRanges.StationSize else '' end ) as Wk1Size, max( case when pq.WeekOrder = 2 then VolRanges.StationSize else '' end ) as Wk2Size, max( case when pq.WeekOrder = 3 then VolRanges.StationSize else '' end ) as Wk3Size, max( case when pq.WeekOrder = 4 then VolRanges.StationSize else '' end ) as Wk4Size, -- just to tack on the "AR" status to extended properties max( case when xa.station_id is null then 'no' else 'yes' end ) as ARStatus from ( entire first pre query above ) pq JOIN ( select 'ds x-small' stationSize, 0 AtLeast, 20000 ButLessThan UNION select 'ds small', 20000, 36000 UNION select 'ds standard', 36000, 42000 UNION select 'ds large', 42000, 72000 UNION select 'ds x-large', 72000, 9999999 ) VolRanges on pq.WkAvg / 7 >= VolRanges.AtLeast AND pq.WkAvg / 7 < VolRanges.ButLessThan -- and also now getting the "ar" value from the extended properties -- doing a LEFT-JOIN in case no such extended properties so the given -- station is not left out of the query. So here, I am EXPLICITLY -- pulling ONLY IF the ar = 'yes' LEFT JOIN wbr_global.raw_station_extended_attribute xa on pq.station_id = xa.ds AND xa.ar = 'yes' group by pq.Station
So now, we still have one row per station AND week, but now tacked-on the per-week basis description
Station Wk1Size Wk2Size Wk3Size Wk4Size ARStatus x ds small ds standard ds small ds small no y ds large ds standard ds large ds large no z … DII1 (sample from properties, assume aggregates) YES
Now, we need to get each individual week to its corresponding 4-part week table, So this is the next nesting and I will tie them to the station size table that has the 4 parts to match up with.
select perStation.*, ss.station_size from ( entire SECOND query above ) perStation JOIN Station_Size ss on perStation.Wk1Size = ss.wk_14 AND perStation.Wk2Size = ss.wk_27 AND perStation.Wk3Size = ss.wk_40 AND perStation.Wk4Size = ss.wk_47
So, the above will get all EXCEPT the final “AR” consideration. So lets look at the “AR” = yes consideration. You have 5 station size types. If anything is HIGHER than standard, that wins. So, the ONLY time you want to UPGRADE someone is if the 4wk combination final station size = “ds small” AND the AR = “YES”. So lets just add that, all else, stay as you are. Ie: If the station is NO for the ar status and they are a size “ds small”, that is the ONLY time it stays small.
I would just change the above to add the extra FinalStationSize consideration.
select perStation.*, ss.station_size, case when perStation.ARStatus = 'yes' AND ss.Station_Size = 'ds small' then 'ds standard' else ss.Station_Size end FinalStationSize
You may get SQL errors that may not be compatible with dbeaver database, that is why I tried to both show you the individual steps building things out, and then nesting together to get to each part. If any errors, let me know and we can try to work them out.
select perStation.*, ss.station_size, case when perStation.ARStatus = 'yes' AND ss.Station_Size = 'ds small' then 'ds standard' else ss.Station_Size end FinalStationSize from ( select pq.Station, max( case when pq.WeekOrder = 1 then VolRanges.StationSize else '' end ) as Wk1Size, max( case when pq.WeekOrder = 2 then VolRanges.StationSize else '' end ) as Wk2Size, max( case when pq.WeekOrder = 3 then VolRanges.StationSize else '' end ) as Wk3Size, max( case when pq.WeekOrder = 4 then VolRanges.StationSize else '' end ) as Wk4Size, -- just to tack on the "AR" status to extended properties max( case when xa.station_id is null then 'no' else 'yes' end ) as ARStatus from ( select lrp.station, DATE_PART("week", lrp.del_date) as WkNum, sum( lrp.volume ) / 7 as WkAvg, -- since the date part of the group by will always be the same -- correlating to the 1st, 2nd, 3rd or 4th week, I now have this -- extra column and just apply MAX(), otherwise SQL will NAG if -- it is not part of group by and its not an aggregate function max( case when DATE_PART("week", lrp.del_date) = 16 then 1 when DATE_PART("week", lrp.del_date) = 26 then 2 when DATE_PART("week", lrp.del_date) = 39 then 3 else 4 end ) as WeekOrder from prophecy_na.na_topology_lrp lrp where -- EXAMPLE, if you KNOW you only care about within a current year -- this will help optimize by pre-qualifying the single year you -- expect the week activity within lrp.del_Date >= '2021-01-01' AND lrp.del_date < '2022-01-01' -- and NOW, only filter out for the 4 specific weeks you are about AND DATE_PART("week", lrp.del_date) in (16,26,39,43) group by lrp.station, DATE_PART("week", lrp.del_date) ) pq JOIN ( select 'ds x-small' stationSize, 0 AtLeast, 20000 ButLessThan UNION select 'ds small', 20000, 36000 UNION select 'ds standard', 36000, 42000 UNION select 'ds large', 42000, 72000 UNION select 'ds x-large', 72000, 9999999 ) VolRanges on pq.WkAvg / 7 >= VolRanges.AtLeast AND pq.WkAvg / 7 < VolRanges.ButLessThan -- and also now getting the "ar" value from the extended properties -- doing a LEFT-JOIN in case no such extended properties so the given -- station is not left out of the query. So here, I am EXPLICITLY -- pulling ONLY IF the ar = 'yes' LEFT JOIN wbr_global.raw_station_extended_attribute xa on pq.station_id = xa.ds AND xa.ar = 'yes' group by pq.Station ) perStation JOIN Station_Size ss on perStation.Wk1Size = ss.wk_14 AND perStation.Wk2Size = ss.wk_27 AND perStation.Wk3Size = ss.wk_40 AND perStation.Wk4Size = ss.wk_47