Skip to content
Advertisement

Difference between count(attribute) and count (*) using group by

I haven’t understood the difference between count(*) and count having an attribute as argument.
Here I make an example: there are two relationships:

project (id, name, number);
employee (ssn, name, surname);
working_on (employee_ssn,   project_id);

With employee_ssn which references employee(ssn) and project_id referencing project(id).
Primary keys: project(id), employee(ssn), working_on (employee_ssn, project_id).
Now I have to find for each project, the id of the project, the name and the number of employees working on it.
A correct solution (found on the book) is this one:

select id, name, count (*)
from working_on join project on id=project_id
group by name, id

I have,’t understood why this solution is valid, shouldn’t count (*) count all tuples? Why this way it gets the exact number of employees working on it?
I have written this solution instead:

select id, name, count (employee_ssn)
from working_on join project on id=project_id
group by name, id

Are the solutions equivalent?
And in general, there is a difference using count (*) and count(attribute)? Could you provide an example where these two syntaxes produce different results?

Advertisement

Answer

I figured out two differences till now –

  1. as eric mentioned above, count(col) doesn’t count null values whereas count (*) does.

  2. It behaves differently in a join query.

Consider a situation where i need to join employee and department table, and i need to find count of employees in each department.

2.a.

SELECT     DPT.ID 
,          COUNT(*) EMP_COUNT
FROM       DEPARTMENT DPT
INNER JOIN EMPLOYEE E
ON         DPT.ID = E.DEPARTMENT_ID
GROUP BY   DPT.ID;

In the above query, it will not return those records where employee table count=0.

2.b.

SELECT     DPT.ID 
,          COUNT(E.DEPARTMENT_ID) EMP_COUNT
FROM       DEPARTMENT DPT
INNER JOIN EMPLOYEE E
ON         DPT.ID = E.DEPARTMENT_ID
GROUP BY   DPT.ID;

If you use count(E.DEPARTMENT_ID) instead of count(*), this time it will give records even when employee table count=0.

2.c.

SELECT     DPT.ID 
,          COUNT(DPT.ID) EMP_COUNT
FROM       DEPARTMENT DPT
INNER JOIN EMPLOYEE E
ON         DPT.ID = E.DEPARTMENT_ID
GROUP BY   DPT.ID;

This is the interesting part, when you use COUNT(DPT.ID), since DPT.ID is column of department, here again records where employee table count=0 will NOT be considered.

User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement