How can SELECT COUNT(*) differ from the count of all records in the table?

So I have a table:

CREATE TABLE TABLE_NAME (
    COLUMN_1   char(12)    NOT NULL,
    COLUMN_2   char(2)     NOT NULL,
    COLUMN_3   char(1)     NOT NULL,
    COLUMN_4   int         NOT NULL,
    COLUMN_5   char(2)     NOT NULL,
    COLUMN_6   money       NOT NULL,
    COLUMN_7   smallint    NOT NULL,
    COLUMN_8   varchar(10) NOT NULL,
    COLUMN_9   smallint    NOT NULL,
    COLUMN_10  datetime    NOT NULL,
    PRIMARY KEY (COLUMN_1, COLUMN_2, COLUMN_3)
)

SELECT COUNT(*) returns a different value than SELECT DISTINCT COUNT(*). How can this be possible?

I also tried

SELECT COUNT(*) FROM (
    SELECT
        COLUMN_1,
        COLUMN_2,
        COLUMN_3,
        COLUMN_4,
        COLUMN_5,
        COLUMN_6,
        COLUMN_7,
        COLUMN_8,
        COLUMN_9,
        COLUMN_10
     FROM TABLE_NAME
    ) TMP

which returned the same count as the distinct query.

I can’t see how with a primary key and all fields being NOT NULL, there can be a different total count than the count of unique records.

BTW, this is on Sybase ASE 15.

The discrepancy is a hundred or so records out of a half million. I’m also seeing this problem in several other tables, but chose just one for the example.

Edit

I should mention for the sake of completeness that I discovered this problem when writing a simple job to completely copy this table to a remote database. My application was recording a certain number of read/write operations, but failed QA because the number of records in the source database differed from the number of records in the target database. Both values were obtained via COUNT(*); the count returned from the target (Oracle 10g) was the same as the number of read/write operations recorded by my app. As all fields on the source table are defined NOT NULL and a primary key is defined, I was at a loss to explain how my application was losing a tiny number of records.

This is when I started using the alternate queries listed above, both of which agreed with my app’s read/write count, as well as the COUNT(*) value returned from the target. In other words, the only value that did not match was the COUNT(*) on the source database.

Answer

In databases that support the optimization, count(*) doesn’t actually retrieve all records and count them. Instead it reads a metadata field that tracks the number of rows (or the approximate number of rows) currently stored in the table. On the other hand, when you do something that requires working with the actual data, the DBMS has to fetch the rows anyway, and it counts them as you would expect.

Of course, it’s reasonable to expect that, regardless of how it’s implemented, the result of count(*) would be the same as that of a more complex but equivalent query. That suggests that your table’s metadata may be corrupted somehow. (I’d say this one is a good bet. I’m not familiar with Sybase specifically, but most DBMSs have a way to force a rebuild of the table metrics, which might be worth a try here.)

Another possible explanation is that the database’s internal row counter is not designed to be 100% accurate. (This second possibility is educated speculation: I don’t know whether it is true of Sybase’s row counter, but it might be worth further investigation.)
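
To narrow down which of these is happening, it may help to compare the plain count(*) with counts that force the rows to be read. Below is a minimal sketch in Python, using an in-memory SQLite database purely as a stand-in for ASE. The cut-down schema, the sample data, and the helper names count_rows_three_ways and duplicate_keys are assumptions for illustration. Against the real table you would run the same SQL strings through whatever driver your copy job already uses.

```python
import sqlite3

# Hypothetical stand-in: the three key columns plus one payload column
# from the question's table, created in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE TABLE_NAME (
        COLUMN_1 char(12) NOT NULL,
        COLUMN_2 char(2)  NOT NULL,
        COLUMN_3 char(1)  NOT NULL,
        COLUMN_4 int      NOT NULL,
        PRIMARY KEY (COLUMN_1, COLUMN_2, COLUMN_3)
    )
    """
)
# Made-up sample data: 1000 rows with distinct keys.
rows = [(f"K{i:05d}", "AA", "X", i) for i in range(1000)]
conn.executemany("INSERT INTO TABLE_NAME VALUES (?, ?, ?, ?)", rows)


def count_rows_three_ways(conn):
    """Return (plain COUNT(*), COUNT(*) over a derived table, rows fetched)."""
    plain = conn.execute("SELECT COUNT(*) FROM TABLE_NAME").fetchone()[0]
    derived = conn.execute(
        "SELECT COUNT(*) FROM ("
        " SELECT COLUMN_1, COLUMN_2, COLUMN_3, COLUMN_4 FROM TABLE_NAME"
        ") TMP"
    ).fetchone()[0]
    # Count client-side by actually reading every row, as a copy job would.
    fetched = sum(1 for _ in conn.execute("SELECT * FROM TABLE_NAME"))
    return plain, derived, fetched


def duplicate_keys(conn):
    """List primary-key tuples that occur more than once (should be empty)."""
    return conn.execute(
        "SELECT COLUMN_1, COLUMN_2, COLUMN_3, COUNT(*) FROM TABLE_NAME"
        " GROUP BY COLUMN_1, COLUMN_2, COLUMN_3 HAVING COUNT(*) > 1"
    ).fetchall()


plain, derived, fetched = count_rows_three_ways(conn)
print(plain, derived, fetched, duplicate_keys(conn))
```

On a healthy table all three counts agree and the duplicate list is empty. If only the plain count(*) stands apart on the real table, that points at the counter or metadata path rather than at the data itself.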

good luck!

User contributions licensed under: CC BY-SA