Skip to content
Advertisement

Estimated number of rows is way off in execution plan

I have a situation where the estimated number of rows in the execution plan is way off

enter image description here

My columns in the join are varchar(50). I have tried different indexes but it does not reduce this problem. I have even tried with an index on the temp table. What else can I do?

PS this is the first place where the estimated number starts to drift… Also the tables are not big (48000 rows).

The code is:

I know it seems that this can be rewritten as:

But the results are not identical and I don’t want to change the results since I am not sure if the creator of this code knew what they were doing.

Some statistics on the columns: householdnumber is always equal to householdid. householdid is nvarchar(50) but householdnumber is varchar(40). The table has 48877 rows. Distinct combination of householdnumber, householdid, primaryCustomerID has 48029 rows. And distinct number of primaryCustomerID is 47152.

Advertisement

Answer

Regarding the code – it appears that the difference between the larger (original) version and your simpler GROUP BY version is that the original finds the minimum profilecreateddate for anyone in that household, whereas your simpler version finds the profilecreateddate for the specific primarycustomerid.

For example (using simpler data)

If you notice in the above, the PROFILECREATEDATE for row 2 is different.

You could therefore try the following code that should give the same results as the original set – see how that goes for time (and confirm it matches the original results).

User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement