Skip to content
Advertisement

Why does the following join increase the query time significantly?

I have a star schema here and I am querying the fact table and would like to join one very small dimension table. I can’t really explain the following:

This takes ~12600ms, but of course there is no joined data, so I can’t “resolve” the imp.os_id to something meaningful, so I add a join:

This effectively doubles the execution time of my query. My question is, what did I leave out from the picture? I would think such a small lookup was not causing huge difference in query execution time.

Advertisement

Answer

Rewritten with (recommended) explicit ANSI JOIN syntax:

First of all, your second query might be wrong, if more or less than exactly one match are found in os_desc for every row in impressions.
This can be ruled out if you have a foreign key constraint on os_id in place, that guarantees referential integrity, plus a NOT NULL constraint on bi.impressions.os_id. If so, in a first step, simplify to:

count(*) is faster than count(column) and equivalent here if the column is NOT NULL. And add a column alias for the count.

Faster, yet:

Aggregate first, join later. More here:

User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement