How inaccurate can the sys.dm_db_partition_stats.row_count be in getting an Azure SQL DB row count for each table?

Question

I have seen a number of general statements on how sys.dm_db_partition_stats.row_count can produce inaccurate results due to providing objects’ statistics instead of actually doing a COUNT(). However, I have never been able to find any deeper reasons behind those statements or validate the hypothesis on my Azure SQL DB.

Accepted Answer

We're glad that you found the solution and solved it yourself. Your edit should be an answer, so I am posting it as one; this can be beneficial to other community members:

Several things I was able to find out on my own, mostly by running various queries containing sys.dm_db_partition_stats.row_count while knowing the actual row counts in each table.

Here's the final query I came up with. It gets a fast and (in my case) accurate row count for each table, sorted from high count to low.

    SELECT
        (SCHEMA_NAME(A.schema_id) + '.' + A.Name) AS table_name,
        B.object_id, B.index_id, B.row_count
    FROM
        sys.dm_db_partition_stats B
    LEFT JOIN
        sys.objects A
        ON A.object_id = B.object_id
    WHERE
        SCHEMA_NAME(A.schema_id) <> 'sys'   -- exclude system tables
        AND B.index_id IN (0, 1)            -- heap (0) or clustered index (1) only
    ORDER BY
        B.row_count DESC

The first line of the WHERE clause excludes system tables, e.g. sys.plan_persist_wait_stats and many others.

The second line takes care of non-clustered indexes, which have their own rows in sys.dm_db_partition_stats. It keeps only index_id 0 (a heap) and index_id 1 (a clustered index). If you don't filter the others out, you get a double row count for indexed tables when using GROUP BY A.schema_id, A.Name, or two records with the same table_name in the query output (if you don't use GROUP BY).

Thanks again for sharing. And thanks for @conor's comment: "If you want to see how far off the numbers can be, I suggest you try doing user transactions, inserting a bunch of rows, then roll back the transaction."
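The experiment suggested in that comment could be sketched roughly as follows. This is only an illustration, not something taken from the original answer: the table dbo.RowCountDemo is a hypothetical scratch table, the row volume is arbitrary, and sys.all_columns is used merely as a convenient row source. It assumes you have permission to create tables and to read sys.dm_db_partition_stats in the database.

```sql
-- Hypothetical scratch table, created outside the transaction
-- so that the rollback below does not remove it.
CREATE TABLE dbo.RowCountDemo (id INT NOT NULL PRIMARY KEY);

BEGIN TRANSACTION;

-- Insert a batch of rows that will never be committed.
INSERT INTO dbo.RowCountDemo (id)
SELECT TOP (100000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_columns AS a
CROSS JOIN sys.all_columns AS b;

-- Metadata row count while the transaction is still open.
SELECT SUM(row_count) AS dmv_row_count
FROM sys.dm_db_partition_stats
WHERE object_id = OBJECT_ID(N'dbo.RowCountDemo')
  AND index_id IN (0, 1);

ROLLBACK TRANSACTION;

-- After the rollback: metadata row count next to the real COUNT(*).
SELECT
    (SELECT SUM(row_count)
     FROM sys.dm_db_partition_stats
     WHERE object_id = OBJECT_ID(N'dbo.RowCountDemo')
       AND index_id IN (0, 1)) AS dmv_row_count,
    (SELECT COUNT(*) FROM dbo.RowCountDemo) AS actual_row_count;

-- Clean up the scratch table.
DROP TABLE dbo.RowCountDemo;
```

Comparing dmv_row_count with actual_row_count at each step, and perhaps from a second session while the transaction is still open, should give a feel for how far the metadata can drift from a real COUNT(*) on your own database.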

Answer