Skip to content
Advertisement

Using Over Partition in sql compared with group by

Given the table creation code bellow, is there an alternative method(s) to display the same result to

select b.*, count(*) over (partition by colour) bricks_total 
from bricks b;

using group by and count(*)? And what’s the difference in this case?

create table bricks 
(
     brick_id integer,
     colour   varchar2(10),
     shape    varchar2(10),
     weight   integer
);

insert into bricks values (1, 'blue', 'cube', 1);
insert into bricks values (2, 'blue', 'pyramid', 2);
insert into bricks values (3, 'red', 'cube', 1);
insert into bricks values (4, 'red', 'cube', 2);
insert into bricks values (5, 'red', 'pyramid', 3);
insert into bricks values (6, 'green', 'pyramid', 1);

commit;

Advertisement

Answer

This query puts the total for each colour on each row:

select b.*, count(*) over (partition by colour) as bricks_total
from bricks b;

Before window functions, a typical solution would be a correlated subquery:

select b.*,
       (select count(*) from bricks b2 where b2.colour = b.colour) as bricks_total
from bricks b;

You can also express this using join and aggregation:

select b.*, bb.bricks_total
from bricks b join
     (select bb.colour, count(*) as bricks_total
      from bricks bb
      group by bb.colour
     ) bb
     using (colour);

These are not 100% the same. The difference is that the original code will return the count of colour even when the value is NULL. This code returns 0.

So, a more precise equivalent would be:

select b.*,
       (select count(*)
        from bricks b2
        where b2.colour = b.colour or
              b2.colour is null and b.colour is null
       ) as bricks_total
from bricks b;
User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement