I have these two query examples with a small difference that I thought would be a performance optimization, but there is no difference. The change is that one of the queries uses conditional logic inside the aggregate, while the other uses simple math to get the same result. I would have expected the conditional logic to be harder for the RDBMS engine to work with than the math. But they show identical plans and essentially identical IO statistics and execution time (the slight variation is, I think, due to the warm cache).
SELECT fact_hourly.dim_timeseries_key,
       fact_hourly.dim_date_key,
       SUM(fact_hourly.energy) sum_energy,
       SUM( IIF(load_type.is_power_demand_high_load_06_22 = 1, energy, 0) ) sum_hl_energy,
       fact_hourly.dim_sources_key --@v_dss_update_time
FROM core.[fact_hourly] fact_hourly
LEFT JOIN core.[ds_hours_load_type] load_type
       ON load_type.dim_date_key = fact_hourly.dim_date_key
      AND load_type.hour_zero_indexed = DATEPART(HOUR, fact_hourly.value_timestamp)
WHERE fact_hourly.dim_timeseries_key = 727949
GROUP BY fact_hourly.dim_timeseries_key, fact_hourly.dim_date_key, fact_hourly.dim_sources_key
SELECT fact_hourly.dim_timeseries_key,
       fact_hourly.dim_date_key,
       SUM(fact_hourly.energy) sum_energy,
       SUM( energy * load_type.is_power_demand_high_load_06_22 ) sum_hl_energy,
       fact_hourly.dim_sources_key --@v_dss_update_time
FROM core.[fact_hourly] fact_hourly
LEFT JOIN core.[ds_hours_load_type] load_type
       ON load_type.dim_date_key = fact_hourly.dim_date_key
      AND load_type.hour_zero_indexed = DATEPART(HOUR, fact_hourly.value_timestamp)
WHERE fact_hourly.dim_timeseries_key = 727949
GROUP BY fact_hourly.dim_timeseries_key, fact_hourly.dim_date_key, fact_hourly.dim_sources_key
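For context, the IO statistics and execution times referred to above can be captured per statement with the standard session settings; nothing specific to these queries is assumed here:

-- Report logical/physical reads and CPU/elapsed time for every statement in this session
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
-- ... run the two queries above (with the actual execution plan enabled) ...
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;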
Answer
I’m going to focus on the differences in the actual execution plans.
Everything to the right of the Adaptive Join is exactly the same (including the Adaptive Join itself).
Both queries requested and were granted the same amount of memory.
There is an extra Compute Scalar when using the “math” query instead of the “IIF” query.
The Compute Scalar that does the math has an extra CPU time of 4 ms, making it more expensive, but only marginally.
<RunTimeCountersPerThread Thread="0" ActualRows="13944" Batches="16" ActualEndOfScans="0" ActualExecutions="1" ActualExecutionMode="Batch" ActualElapsedms="4" ActualCPUms="4" ActualScans="0" ActualLogicalReads="0" ActualPhysicalReads="0" ActualReadAheads="0" ActualLobLogicalReads="0" ActualLobPhysicalReads="0" ActualLobReadAheads="0"/>
When looking at the extra “math” Compute Scalar, we can see it is also suffering from an implicit conversion. This is not a big issue at the moment, but it may cost you a few milliseconds. It would become more problematic if it prevented the use of a suitable index, which is not the case here.
CONVERT_IMPLICIT(numeric(3,0),[ELSA].[core].[ds_hours_load_type].[is_power_demand_high_load_06_22] as [load_type].[is_power_demand_high_load_06_22],0)
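If you wanted the conversion to be explicit rather than implicit, a hypothetical rewrite of the “math” aggregate could cast the flag column yourself (this assumes energy is a numeric/decimal column and the flag is a bit or tinyint; adjust the target type to your actual schema, and whether it changes the plan at all would need testing):

-- Hypothetical: make the conversion of the flag column explicit in the query
SUM( fact_hourly.energy
     * CAST(load_type.is_power_demand_high_load_06_22 AS numeric(3,0)) ) AS sum_hl_energy

The “IIF” form returns energy or 0 directly instead of multiplying by the flag column, which is why it does not show this particular conversion in its plan.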
This cost (the extra Compute Scalar and the implicit conversion) is passed through to the next Compute Scalar, the Hash Match, and all other operators to the left, which show slightly more estimated work if you look at the estimated subtree cost (query bucks).
Compute Scalar estimated subtree cost:
- Math: 1.02816
- IIF: 1.02801

Hash Match estimated subtree cost:
- Math: 1.04213
- IIF: 1.04182
When looking deeper at the XML of the execution plans, we see something extra for the “IIF” query.
<WaitStats>
  <Wait WaitType="PAGEIOLATCH_SH" WaitTimeMs="24" WaitCount="1"/>
  <Wait WaitType="RESERVED_MEMORY_ALLOCATION_EXT" WaitTimeMs="1" WaitCount="164"/>
</WaitStats>
This means the “IIF” query was fetching pages from disk and was also waiting for some memory. I assume you executed the “IIF” query first and the “math” query second, which left all the pages in the buffer pool for the second query.
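If you want to compare the two queries under the same cache conditions, you could either run each one twice and compare the warm runs, or reset the buffer pool between runs. A rough sketch, for a test/dev server only:

-- Test/dev only: write dirty pages to disk, then drop all clean pages from the buffer pool,
-- so the next query starts from a cold cache.
CHECKPOINT;
DBCC DROPCLEANBUFFERS;
-- Run the "IIF" query and capture its plan/statistics, repeat the two statements above,
-- then run the "math" query.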