Skip to content
Advertisement

Creating unique row IDs in SQL using partition

I’m hoping that someone can either tell me where my syntax is wrong or let me know if there’s a better way to achieve my goal. I have taken a large dataset and applied a number of identifying fields to later group the data by. I would like to apply a unique ‘Cohort ID’ to the data that I can use to later group the data by. I was thinking that a Row_Number() Over (Partition) type of structure would work but it clearly isn’t for some reason. I am getting Cohort IDs that are different for my specified fields when they should be the same. My code is below:

To further specify what I am trying to do, if all items in my selected fields below are equal for two different rows of data, I want that to be labeled with the same Cohort ID. If the next row is the same as the first two except for a different ‘Issuer’, for example, I want that to get a different Cohort ID.

Data sample below with my expected Cohort ID

Advertisement

Answer

Your sample results suggest that you want dense_rank() — no gaps:

The partition by splits the counting into different groups. That means that 1 will be repeated in each group. The order by simply orders them.

I assume that you have other columns as well. Duplicates in tables is generally not very useful. I would recommend instead that you remove them.

User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement