I have a table tb_xyz
having field1, field2 as indexed.
Now I have two queries:
SELECT `id` FROM `tb_xyz` force index(field1, field2, date_utc) WHERE `field1`=472 AND `field2`=49066 AND `date_utc` >= DATE(NOW() - interval 1 day) AND `date_utc` <= DATE(NOW() + interval 1 day) AND `status` NOT IN (1,4,6) LIMIT 1
and
SELECT count(*) AS cnt FROM `tb_xyz` force index(user_id, fleet_id, job_date_utc) WHERE `field1`=472 AND `field2`=49066 AND `date_utc` >= date(now() - interval 1 day) AND `date_utc` <= date(now() + interval 1 day) AND `status` NOT IN (1,4,6)
Assuming both queries are written to just check if this condition is met at least once. Which of them is more optimal. I tried checking for their execution times, but results were mixed, for small data.
Advertisement
Answer
The first query (SELECT ... LIMIT 1
) will perform better as it can stop scanning the table as soon as it finds a matching row while the COUNT(*)
query will have to scan the entire table to compute its result. You may find that using EXISTS
:
SELECT EXISTS ( SELECT * FROM `tb_xyz` force index(user_id, fleet_id, job_date_utc) WHERE `field1`=472 AND `field2`=49066 AND `date_utc` >= date(now() - interval 1 day) AND `date_utc` <= date(now() + interval 1 day) AND `status` NOT IN (1,4,6) )
is even more efficient than the LIMIT
query; looking at your execution plan (from EXPLAIN <query>
) will let you know if there’s a difference between the two which might cause a performance difference.