Skip to content
Advertisement

SQL: difference between PARTITION BY and GROUP BY

I’ve been using GROUP BY for all types of aggregate queries over the years. Recently, I’ve been reverse-engineering some code that uses PARTITION BY to perform aggregations.

In reading through all the documentation I can find about PARTITION BY, it sounds a lot like GROUP BY, maybe with a little extra functionality added in.

Are they two versions of the same general functionality or are they something different entirely?

Advertisement

Answer

They’re used in different places. GROUP BY modifies the entire query, like:

select customerId, count(*) as orderCount
from Orders
group by customerId

But PARTITION BY just works on a window function, like ROW_NUMBER():

select row_number() over (partition by customerId order by orderId)
    as OrderNumberForThisCustomer
from Orders
  • GROUP BY normally reduces the number of rows returned by rolling them up and calculating averages or sums for each row.
  • PARTITION BY does not affect the number of rows returned, but it changes how a window function’s result is calculated.
User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement