Implementation of QueryCache

Aside from doing a direct match on something like a whitespace-normalized hash of a query, what might be a useful (but not necessarily perfect) way to handle a query cache in a partial manner? For example, let’s take the following basic case:
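For instance, assuming a hypothetical sales table sitting in a slow, non-indexed source (all names here are made up for illustration), the base query might be something like:

    # Hypothetical base query over a slow, non-indexed source.
    base_query = """
        SELECT customer_id, region, SUM(amount) AS total_amount
        FROM sales
        WHERE order_date >= '2020-01-01'
        GROUP BY customer_id, region
    """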

This could potentially be used as a ‘base-cache’ upon which a further query could be executed to improve performance:
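A refinement that could, at least in principle, be answered from that cached result (again hypothetical) might be:

    # Hypothetical refinement: same FROM and grouping, fewer output fields,
    # one extra filter and a limit -- all of which could be applied to the
    # cached rows of base_query instead of re-reading the slow source.
    refined_query = """
        SELECT customer_id, SUM(amount) AS total_amount
        FROM sales
        WHERE order_date >= '2020-01-01'
          AND region = 'EU'
        GROUP BY customer_id, region
        LIMIT 100
    """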

So, assuming the data in the from table(s) doesn’t change, the following might serve as a starting point for determining a cache match:

  • fields: [field:type, …] // can be less but not more
  • from: hash of table(s)+joins
  • filters: [filter1, filter2, …] // can be less but not more
  • aggregations: [agg1, agg2, …] // can be less but not more
  • having: [having1, having2, …] // can be less but not more
  • order+limit+offset if limited result-set // can be less but not more
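As a rough illustration of what such a cache entry could carry (every name and value below is a hypothetical example, not a proposed format):

    # Illustrative shape of a cache entry built from the components above.
    base_cache_key = {
        "fields": {"customer_id": "int", "region": "text", "total_amount": "numeric"},
        "from_hash": "<hash of table(s) + joins>",
        "filters": ["order_date >= '2020-01-01'"],  # can be less but not more
        "aggregations": ["SUM(amount)"],            # can be less but not more
        "having": [],                               # can be less but not more
        "order_limit_offset": None,                 # only relevant for a limited result-set
    }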

However, this becomes quite tricky when we think about something like the following case:
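For example (a purely hypothetical case; the awkward variants can take many forms), a query that switches to an OR filter and a different grouping is much harder to serve from the cached rows:

    # Hypothetical "tricky" case: the OR references a row-level column that is
    # not in the cached output, and the WHERE clause is no longer a simple list
    # of AND-ed predicates that can be applied on top of the cache.
    tricky_query = """
        SELECT region, COUNT(DISTINCT customer_id) AS customers
        FROM sales
        WHERE region = 'EU' OR amount > 1000
        GROUP BY region
    """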

What might be a realistic starting point for how a partial query cache could be implemented?


Use case: writing SQL to query data in a non-DBMS-managed source, such as a CSV file, where any query takes ~20 s and no indexes can be created on the file (e.g. https://en.wikipedia.org/wiki/SQL/MED or a Spark-like engine).

Answer

I think the following might be a good starting point for a basic cache implementation that allows a cached result to be further queried for refinements:

  1. Start by substituting in any UDFs or CTEs so that the query itself is self-contained.
  2. Normalize whitespace and capitalization.
  3. Hash the entire query. This will be our starting point.
  4. Remove the select fields and hash the rest of the query, then store a hash of each individual item in the select list.
  5. For the partial cache, generate a hash of the query minus the select fields, WHERE clause, ORDER BY, and LIMIT+OFFSET. Also hash each item in the WHERE list (split on AND) so you can verify that the cache contains no filter that is not also in the current query, hash the ORDER BY to see whether the data needs to be re-sorted, and store the LIMIT+OFFSET number so you can verify that the limit+offset in the initial query is null or greater than that of the current query.
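A minimal Python sketch of steps 2-5 might look like the following. It assumes the query is already self-contained (step 1 done) and uses a deliberately naive regex in place of a real SQL parser, so it only handles a single simple SELECT:

    import hashlib
    import re

    def md5(text: str) -> str:
        return hashlib.md5(text.encode("utf-8")).hexdigest()

    def normalize(query: str) -> str:
        # Step 2: collapse whitespace and lower-case everything.
        return re.sub(r"\s+", " ", query).strip().lower()

    # Very naive single-SELECT grammar; a real implementation needs a SQL parser.
    PARTS = re.compile(
        r"select (?P<fields>.+?)"
        r" from (?P<from>.+?)"
        r"(?: where (?P<where>.+?))?"
        r"(?: group by (?P<group>.+?))?"
        r"(?: having (?P<having>.+?))?"
        r"(?: order by (?P<order>.+?))?"
        r"(?: limit (?P<limit>\d+)(?: offset (?P<offset>\d+))?)?$"
    )

    def cache_entry(query: str) -> dict:
        q = normalize(query)
        m = PARTS.match(q)
        if m is None:
            raise ValueError("query is too complex for this naive sketch")
        p = m.groupdict()

        fields = [f.strip() for f in p["fields"].split(",")]
        wheres = [w.strip() for w in (p["where"] or "").split(" and ") if w.strip()]

        # Step 5: the query minus select fields, WHERE, ORDER BY and LIMIT+OFFSET.
        core = "from " + p["from"]
        if p["group"]:
            core += " group by " + p["group"]
        if p["having"]:
            core += " having " + p["having"]

        return {
            "Hash": md5(q),                               # step 3
            "HashMinusFields": md5(q[m.end("fields"):]),  # step 4
            "HashOfFields": [md5(f) for f in fields],     # step 4
            "HashMinusFieldsWhereOrderLimit": md5(core),  # step 5
            "HashOfWheres": [md5(w) for w in wheres],     # step 5
            "HashOfOrder": md5(p["order"]) if p["order"] else None,
            "LimitOffset": (int(p["limit"]), int(p["offset"] or 0)) if p["limit"] else None,
        }

The returned dictionary mirrors the saved fields in the listing below; comparing a new query to a stored entry then comes down to comparing these pieces.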

Here is an example of how the saved data might look:

Hash 673c0185c6a580d51266e78608e8e9b2
HashMinusFields 41257d239fb19ec0ccf34c36eba1948e
HashOfFields [dc99e4006c8a77025c0407c1fdebeed3, …]
HashMinusFieldsWhereOrderLimit d50961b6ca0afe05120a0196a93726f5
HashOfWheres [0519669bae709d2efdc4dc8db2d171aa, …]
HashOfOrder 81961d1ff6063ed9d7515a3cefb0c2a5
LimitOffset null

Now let’s try a few examples; I will use human-readable hashes to make them easier to follow:
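One such example, sketched in code with the hypothetical cache_entry function from above (so with real hashes rather than human-readable ones), might be:

    # Two hypothetical queries; cache_entry() is the sketch from earlier.
    cached = cache_entry("SELECT id, name, total FROM orders WHERE year = 2020")
    current = cache_entry("SELECT id, total FROM orders "
                          "WHERE year = 2020 AND region = 'EU' LIMIT 50")

    reusable = (
        # same FROM / GROUP BY / HAVING core
        cached["HashMinusFieldsWhereOrderLimit"] == current["HashMinusFieldsWhereOrderLimit"]
        # fields: can be less but not more
        and set(current["HashOfFields"]) <= set(cached["HashOfFields"])
        # no filter in the cache that is not in the current query
        and set(cached["HashOfWheres"]) <= set(current["HashOfWheres"])
        # cached limit+offset must be null or greater than the current one
        and (cached["LimitOffset"] is None
             or (current["LimitOffset"] is not None
                 and sum(cached["LimitOffset"]) >= sum(current["LimitOffset"])))
    )
    needs_resort = cached["HashOfOrder"] != current["HashOfOrder"]

    print(reusable, needs_resort)  # True False: apply the extra region filter to the cached rows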

Does the above seem like a valid initial implementation?
