I have a table tb_xyz having field1, field2 as indexed.
Now I have two queries:
SELECT `id` FROM `tb_xyz` force index(field1, field2, date_utc) WHERE `field1`=472 AND `field2`=49066 AND `date_utc` >= DATE(NOW() - interval 1 day) AND `date_utc` <= DATE(NOW() + interval 1 day) AND `status` NOT IN (1,4,6) LIMIT 1
and
SELECT count(*) AS cnt FROM `tb_xyz` force index(user_id, fleet_id, job_date_utc) WHERE `field1`=472 AND `field2`=49066 AND `date_utc` >= date(now() - interval 1 day) AND `date_utc` <= date(now() + interval 1 day) AND `status` NOT IN (1,4,6)
Assuming both queries are written to just check if this condition is met at least once. Which of them is more optimal. I tried checking for their execution times, but results were mixed, for small data.
Advertisement
Answer
The first query (SELECT ... LIMIT 1) will perform better as it can stop scanning the table as soon as it finds a matching row while the COUNT(*) query will have to scan the entire table to compute its result. You may find that using EXISTS:
SELECT EXISTS (
SELECT *
FROM `tb_xyz` force index(user_id, fleet_id, job_date_utc)
WHERE `field1`=472
AND `field2`=49066
AND `date_utc` >= date(now() - interval 1 day)
AND `date_utc` <= date(now() + interval 1 day)
AND `status` NOT IN (1,4,6)
)
is even more efficient than the LIMIT query; looking at your execution plan (from EXPLAIN <query>) will let you know if there’s a difference between the two which might cause a performance difference.