Faster counts with MySQL by sampling the table

I’m looking for a way to get a count of records meeting a condition, but my problem is that the table is billions of rows long and a basic COUNT(*) is not possible because it times out.

I thought that maybe it would be possible to sample the table by doing something like selecting a quarter of the records. I believe that older records are more likely to match, so I’d need a method that accounts for this (perhaps random sorting).

Is it possible or reasonable to query a certain percentage of rows in MySQL? And is this the smartest way to solve this problem?
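
The kind of sampling I have in mind would look something like the sketch below. (This assumes an AUTO_INCREMENT id and that the sampled id range is representative, which, given that older records match more often, probably isn’t true; table and column names are placeholders.)

-- Count a contiguous quarter of the id range, then extrapolate
SELECT COUNT(*) * 4 AS estimated_count
FROM table_name
WHERE id BETWEEN 1 AND (SELECT MAX(id) DIV 4 FROM table_name)
  AND deleted_at IS NOT NULL;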

The query I currently have which doesn’t work is pretty simple:

SELECT count(*) FROM table_name WHERE deleted_at IS NOT NULL


Answer

SHOW TABLE STATUS will ‘instantly’ give an approximate Row count. (There is an equivalent SELECT ... FROM information_schema.tables.) However, this may be significantly far off.
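
For example (table and schema names are placeholders; for InnoDB the Rows / TABLE_ROWS value is an estimate, not an exact count):

SHOW TABLE STATUS LIKE 'table_name';

SELECT TABLE_ROWS
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'your_database'
  AND TABLE_NAME = 'table_name';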

A COUNT(*) that can be satisfied from a secondary index (an index on a column other than the PRIMARY KEY) will be faster, because that index is smaller than the table’s data, which in InnoDB is clustered with the PRIMARY KEY. But this still may not be fast enough.
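
For the query in the question, a secondary index on deleted_at would let the count run entirely in that index rather than over the whole table. A rough sketch (the index name is illustrative, and building an index on billions of rows is itself a long operation):

ALTER TABLE table_name ADD INDEX idx_deleted_at (deleted_at);

SELECT COUNT(*) FROM table_name WHERE deleted_at IS NOT NULL;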

There is no way to “sample”. Or at least no way that is reliably better than SHOW TABLE STATUS. EXPLAIN SELECT ... with some simple query will do an estimate; again, not necessarily any better.
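
For instance, this returns the optimizer’s row estimate for the filter without scanning the whole table (the rows column is derived from index statistics and can be well off):

EXPLAIN SELECT COUNT(*) FROM table_name WHERE deleted_at IS NOT NULL;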

Please describe what kind of data you have; there may be some other tricks we can use.

See also Random. There may be a technique there that will help you “sample”. Be aware that all such techniques are subject to how the data was generated and whether there has been “churn” on the table.

Can you periodically run the full COUNT(*) and save it somewhere? And then maintain the count after that?
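
A minimal sketch of that idea, assuming a small summary table that your application keeps up to date after the one expensive full count (all names are placeholders):

CREATE TABLE row_counts (
  counted_table VARCHAR(64) PRIMARY KEY,
  deleted_count BIGINT NOT NULL
);

-- Seed it once with the expensive full count (run off-peak)
INSERT INTO row_counts (counted_table, deleted_count)
SELECT 'table_name', COUNT(*) FROM table_name WHERE deleted_at IS NOT NULL;

-- Then bump the counter whenever a row is soft-deleted, in the same transaction
UPDATE row_counts SET deleted_count = deleted_count + 1
WHERE counted_table = 'table_name';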

I assume you don’t have the following case. (Else the solution is trivial, as sketched below.)

  • AUTO_INCREMENT id
  • Never DELETEd or REPLACEd or INSERT IGNOREd or ROLLBACKd any rows
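
In that case the latest id is itself the row count, so something like this (assuming the table really does meet both conditions above) would be enough:

SELECT MAX(id) AS approximate_row_count FROM table_name;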