I am facing this problem at my companies logistics:
Given array of mixed values
arr = [ v1 => a, v2 => b, v3 => c, v4 => d ]
ordered by priority ASC (v1 is more important than v2, … etc)
I need to search values in a table t like this:
select * from t where ... And t.v1 = a And t.v2 = b And t.v3 = c And t.v4 = d
The 3 dots in query are the fixed conditions
If I cant find any value from query, perform same query ignoring the least important value from array
select * from t where ... And t.v1 = a And t.v2 = b And t.v3 = c
Again if value was not found. perform a query ignoring the next least important value
select * from t where ... And t.v1 = a And t.v2 = b And t.v4 = d
The query can ignore all elements from array as last query
select * from t where ...
Repeat that until find at least 1 query result or all elements in array are nulled and no result were found. (return false)
I did my search algorithm using binary numbers to find the value which works like this
binaryValue = (2 ^ arr.lenght) - 1 //(In this case: 2^4-1 = 15)
transform binary 15 in array of boolean exploded = [1, 1, 1, 1] (15 in binary is 1111)
Than make the query considering position 0 of booleanArray exploded if true then Consider value in arr of mixed values and so on
loop from 15 to 0 (the 0 iteration is when the boolean array is [0,0,0,0] and ignores all cases The sequence will be like this with this logic:
[a, b, c, d] // [1,1,1,1] = 15 (select * from t where ... and v1=a and v2=b and v3=c and v4=d) [a, b, c] // [1,1,1,0] = 14 (select * from t where ... and v1=a and v2=b and v3=c) [a, b, d] // [1,1,0,1] = 13 (select * from t where ... and v1=a and v2=b and v4=d) ... [] // [0,0,0,0] = 0 (select * from t where ...)
The search algorithm works fine but I have a huge problem to performance.
The number of queries performed in this search is arr.length! (fatorial) So when the array is length 4, the worst case scenario performs 24 queries. if the arr.length is 6 (what is the length I am dealing now in my code in production) the worst case performs 720 queries which is unacceptable.
I need a way to Improve this search. Can someone help me?
Thanks in advance.
Advertisement
Answer
I came up to the solution using the concept of row’s desirability suggested by @RBarryYoung
Instead of performing multiple queries, I fetch all relevant rows into a datatable (1 query) and for each rows I applied a score based in desirability. (Code side)
Dim dicFields As New Dictionary(Of String, String) From { _ {"v1", a}, _ {"v2", b}, _ {"v3", c}, _ {"v4", d} } Dim intScore As Integer = dicFields.Keys.Count For Each pair As KeyValuePair(Of String, String) In dicFields lstRow _ .Where(Function(p) p.Item(pair.Key) = pair.Value) _ .ToList() _ .ForEach(Sub(p) p.Item("__Score") += intScore) intScore -= 1 Next return lstRow _ .OrderByDescending(Function(p) p.Item("__Score")) _ .FirstOrDefault()
It reduced the complexity of n! to just 1 query and n filters. System are up and running smoothly. Thanks for the help @RBarryYoung