When using a CASE statement to aggregate fields in Redshift, is it more performant to replace binary fields with 1s and 0s?

Question

For example, which of the following calculations should perform faster?

or

For the sake of this example, assume that when fieldA is not null, fieldB will always equal 1. fieldB can also equal 1 if fieldA is null, which is why I use the CASE statement.

Accepted Answer

The two queries do not do the same thing, unless fieldB is uniformly 1 (or uniformly 1 when fieldA is not NULL). In general, you should run the query that does what you really need.

Redshift is a columnar database. That means that every column used in a query adds overhead to the execution. Hence, it is better to avoid reading a column if you can. Of course, if the column is referenced elsewhere in the query, then this does not apply.

In addition, SUM() operates on numbers. I'm not sure if "binary" means that the value is a number. If not, then it needs to be converted, which also adds overhead.
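To make the first point concrete, here is a minimal sketch of how two such aggregates can diverge. The question's original queries are not available here, so the two CASE expressions, the table name `t`, and the sample rows are all assumptions made up for illustration. SQLite (through Python's standard `sqlite3` module) stands in for Redshift only to show the semantics; it says nothing about Redshift performance.

```python
import sqlite3

# Hypothetical table and rows; only the column names fieldA and fieldB
# come from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (fieldA TEXT, fieldB INTEGER)")
rows = [
    ("x", 1),    # fieldA set, fieldB = 1 (matches the asker's assumption)
    (None, 1),   # fieldA NULL, fieldB = 1
    (None, 0),   # fieldA NULL, fieldB = 0
    ("y", 0),    # breaks the assumption: fieldA set but fieldB = 0
]
conn.executemany("INSERT INTO t VALUES (?, ?)", rows)

# Assumed variant 1: sums the constant 1, so only fieldA is referenced.
sum_constant = conn.execute(
    "SELECT SUM(CASE WHEN fieldA IS NOT NULL THEN 1 ELSE 0 END) FROM t"
).fetchone()[0]

# Assumed variant 2: sums fieldB, so both columns are referenced.
sum_field = conn.execute(
    "SELECT SUM(CASE WHEN fieldA IS NOT NULL THEN fieldB ELSE 0 END) FROM t"
).fetchone()[0]

print(sum_constant, sum_field)
```

The two sums agree only while every row with a non-NULL fieldA also has fieldB equal to 1. The last sample row breaks that condition, so the results differ, which is why the answer suggests choosing the query by what you actually need before considering speed.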

