SQL query – What does taking the MIN of this boolean expression mean?

Question

Excuse my ignorance about this… I'm taking a data analysis course and I stumbled upon this query in an exercise: ActivityDate is a field that contains date type data and DATE_REGEX is a regular expression variable for a date format string. What I don't know is what taking the MIN() of …

Accepted Answer

The query selects rows from the table and applies the REGEXP_CONTAINS() function to every (string-converted) value in the ActivityDate column. REGEXP_CONTAINS() returns either true or false depending on whether the value matches the regex pattern in DATE_REGEX.

How MIN() behaves here can vary by implementation:

Booleans might be coerced to integers, so MIN() is evaluating 0s and 1s. If all the values are 1 (true), MIN() will be 1 (true); otherwise it will be 0 (false).

Other implementations might compare booleans directly, so MIN() returns true if all of the values are true, because the minimum value is then true (true being "greater" than false); otherwise it returns false.

The result, depending on the implementation, is that MIN() returns 0/1 or false/true. Either way, that result is compared to true in the CASE statement. If all values matched the regex, the comparison will be true.

Basically, the query asks "does every row have a valid date in the ActivityDate column?" The result will be a table with a single column valid_test and one row, containing "Valid" if they all match, "Not Valid" otherwise.

Another way to look at it, familiar from some programming languages, is that MIN(bool_function()) is analogous to all(), meaning return true if all values are true. Similarly, MAX(bool_function()) is analogous to any(), meaning return true if any value is true.
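To make the all()/any() analogy concrete, here is a minimal Python sketch of the same logic. It is an illustration only, not the exercise's query: the date pattern assigned to DATE_REGEX, the helper names matches() and valid_test(), and the sample values are all assumptions made up for this example. It relies on the fact that Python orders False before True, so min() over a non-empty list of booleans behaves like all() and max() behaves like any().

```python
import re

# Hypothetical stand-in for DATE_REGEX (YYYY-MM-DD); the exercise's real
# pattern is not shown in the question.
DATE_REGEX = r"^\d{4}-\d{2}-\d{2}$"


def matches(value: str) -> bool:
    """Rough analogue of REGEXP_CONTAINS(value, DATE_REGEX)."""
    return re.search(DATE_REGEX, value) is not None


def valid_test(values):
    """Mimic: CASE WHEN MIN(<match>) = TRUE THEN 'Valid' ELSE 'Not Valid' END.

    Assumes `values` is non-empty (min() raises ValueError on an empty list).
    """
    flags = [matches(v) for v in values]
    # False < True, so the minimum is True only if every flag is True.
    return "Valid" if min(flags) is True else "Not Valid"


# Made-up sample data: one column where every value matches, one where a
# single value does not.
all_good = ["2016-04-12", "2016-04-13"]
one_bad = ["2016-04-12", "4/13/2016"]

print(valid_test(all_good))  # Valid
print(valid_test(one_bad))   # Not Valid

# For non-empty lists, min() agrees with all() and max() agrees with any().
flags = [matches(v) for v in one_bad]
print(min(flags) == all(flags))  # True
print(max(flags) == any(flags))  # True
```

A single non-matching value is enough to pull the minimum down to False, which is why the whole column is reported as "Not Valid".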

Answer