Skip to content
Advertisement

Does the column order matter in a WHERE SQL statement

I’m working on a system with a good amount of indexes. Some indexes are simpler than others. I.e. they’re INT, VARCHAR, DATETIME and in some cases ENUMS(maybe 5~25 variations).

Does the WHERE order matter? In other words, would placing the easier to search columns first increase speed/performance?

i.e. Let’s say we have this table and it looks like this

CREATE TABLE IF NOT EXISTS `example_table` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT PRIMARY KEY,
  `user_id` int(11) unsigned NOT NULL, -- 1 ~ 4,294,967,295 (non unique)
  `type_id` int(11) unsigned NOT NULL, -- (Enum with 15 values)
  `name` VARCHAR(255) NOT NULL, -- alphanumeric 
  `boolean_value` tinyInt(1) DEFAULT 0, -- only 0 or 1
  `created_date` DATETIME NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE INDEX `example_table_user_id_index` ON `example_table` (`user_id`);
CREATE INDEX `example_table_type_id_index` ON `example_table` (`type_id`);
CREATE INDEX `example_table_name_index` ON `example_table` (`name`);
CREATE INDEX `example_table_boolean_value_index` ON `example_table` (`boolean_value`);
CREATE INDEX `example_table_created_date_index` ON `example_table` (`created_date`)

Is it correct to say that we want to search for the column which will return the least results first that way the next condition will have less columns to flip through?

i.e. Would these statements give different performance results?

1. SELECT id FROM example_table WHERE boolean_value = ? AND user_id = ? AND type_id = ? LIMIT 1000
2. SELECT id FROM example_table WHERE type_id = ? AND user_id = ? AND boolean_value = ? LIMIT 1000
3. SELECT id FROM example_table WHERE user_id = ? AND type_id = ? AND boolean_value = ? LIMIT 1000

Advertisement

Answer

Does the WHERE order matter? In other words, would placing the easier to search columns first increase speed/performance?

Short answer : no.

Longer answer : SQL is a declarative, not procedural, language. It’s the only declarative language most of us devs deal with regularly. The server software has query planning modules that use various schemes for figuring out the most-likely most efficient way of getting the result set from each query. The query plan can vary, even with the same query and different data. So we tell it what we want, not how to get it.

Sometimes it’s necessary to adjust indexes or refactor queries to get decent performance. You can read about that in the tag. But the refactoring is never as simple as changing the order of terms in WHERE clauses.

And, pro tip: lots of single column indexes are very rarely a good idea. Indexes need to be designed to match the shape of the queries in use. Read this ebook by Marcus Winands: https://use-the-index-luke.com

User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement