Let's say there's a table that logs website access.
(id is the only index column)

| id | domain | logged_at |
|----|------------|--------------------------|
| 1 | yahoo.com | 2002-04-08T16:44:32.654Z |
| 2 | google.com | 2002-04-02T19:12:12.124Z |
| 3 | google.com | 2002-04-01T21:54:29.852Z |
| 4 | amazon.com | 2002-03-10T02:11:01.555Z |
| 5 | cnn.com | 2002-05-11T09:26:34.292Z |
| 6 | amazon.com | 2002-04-02T22:06:18.119Z |
| ... | ... | ... |
| 2468955885 | netflix.com | 2011-01-08T16:44:32.654Z |
| 2468955886 | facebook.com | 2011-07-02T19:12:12.124Z |
| 2468955887 | uber.com | 2011-05-01T21:54:29.852Z |
| 2468955888 | google.com | 2011-02-10T02:11:01.555Z |
| 2468955889 | uber.com | 2011-04-11T09:26:34.292Z |
| 2468955890 | amazon.com | 2011-05-02T22:06:18.119Z |
I want to select only the `uber.com` rows. I can guarantee that every `uber.com` access log entry is later than `2009-03-01T00:00:00.000Z`.
SQL-A

    SELECT * FROM table WHERE domain = 'uber.com';

SQL-B

    SELECT * FROM table WHERE domain = 'uber.com' AND logged_at > '2009-03-01T00:00:00.000Z';
Is there any performance difference between SQL-A and SQL-B?
Answer
In general, no.
I can think of two situations where there would be a performance impact:
- There is an index starting with `logged_at` and no index with `domain` as the first column.
- The table is partitioned by `logged_at`.
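To illustrate the first case, here is a minimal sketch. It assumes the table is called `access_log` (a hypothetical name, since the question only calls it `table`) and that an index on `logged_at` has been added, which the question does not describe. The `CREATE INDEX` syntax is broadly portable, but whether the optimizer actually picks the index depends on your database and data.

```sql
-- Hypothetical setup: an index that leads with logged_at, and none on domain.
CREATE INDEX idx_access_log_logged_at ON access_log (logged_at);

-- SQL-A has no predicate this index can use, so every row must be examined.
SELECT * FROM access_log WHERE domain = 'uber.com';

-- SQL-B can range-scan the index from 2009-03-01 onward and
-- check domain only on the rows inside that range.
SELECT * FROM access_log
WHERE domain = 'uber.com'
  AND logged_at > '2009-03-01T00:00:00.000Z';
```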
You did not mention anything in the question to suggest that either of these might be the case.
For a regular query, you want an index on `(domain)` or `(domain, logged_at)`. Either query can use either index, and the two would have very similar performance.
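As a rough sketch of that advice, again assuming the hypothetical table name `access_log` and made-up index names, you would create one of the two indexes and then compare the plans. `EXPLAIN` is available in MySQL and PostgreSQL, among others; other databases expose query plans differently.

```sql
-- Option 1: single-column index on domain.
CREATE INDEX idx_access_log_domain ON access_log (domain);

-- Option 2 (alternative): composite index, domain first.
CREATE INDEX idx_access_log_domain_logged_at ON access_log (domain, logged_at);

-- Compare the plans the database chooses for SQL-A and SQL-B.
EXPLAIN SELECT * FROM access_log WHERE domain = 'uber.com';

EXPLAIN SELECT * FROM access_log
WHERE domain = 'uber.com'
  AND logged_at > '2009-03-01T00:00:00.000Z';
```

If both plans show an index lookup on `domain`, the extra date condition should make little practical difference.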
I should note that the second query incurs a small amount of overhead for the unnecessary date comparison. However, that is likely to be close to unmeasurable if you have a large amount of data.