Skip to content
Advertisement

Incrementing count ONLY for duplicates in MySQL

Here is my MySQL table. I updated the question by adding an ‘id’ column to it (as instructed in the comments by others).

id    data_id

1     2355
2     2031
3     1232
4     9867
5     2355
6     4562
7     1232
8     2355

I want to add a new column called row_num to assign an incrementing number ONLY for duplicates, as shown below. Order of the results does not matter.

id    data_id     row_num
3     1232        1
7     1232        2
2     2031        null
1     2355        1
5     2355        2
8     2355        3
6     4562        null
4     9867        null

I followed this answer and came up with the code below. But following code adds a count of ‘1’ to non-duplicate values too, how can I modify below code to add a count only for duplicates?

select data_id,row_num
from (
      select data_id,
             @row:=if(@prev=data_id,@row,0) + 1 as row_num,
             @prev:=data_id
        from my_table
)t

Advertisement

Answer

I am assuming that id is the column that defines the order on the rows.

In MySQL 8 you can use row_number() to get the number of each data_id and a CASE with EXISTS to exclude the rows which have no duplicate.

SELECT t1.data_id,
       CASE 
         WHEN EXISTS (SELECT *
                             FROM my_table t2
                             WHERE t2.data_id = t1.data_id
                                   AND t2.id <> t1.id) THEN
           row_number() OVER (PARTITION BY t1.data_id
                              ORDER BY t1.id)
       END row_num
       FROM my_table t1;

In older versions you can use a subquery counting the rows with the same data_id but smaller id. With an EXISTS in a HAVING clause you can exclude the rows that have no duplicate.

SELECT t1.data_id,
       (SELECT count(*)
               FROM my_table t2
               WHERE t2.data_id = t1.data_id
                     AND t2.id < t1.id
               HAVING EXISTS (SELECT *
                                     FROM my_table t2
                                     WHERE t2.data_id = t1.data_id
                                           AND t2.id <> t1.id)) + 1 row_num
      FROM my_table t1;

db<>fiddle

User contributions licensed under: CC BY-SA
9 People found this is helpful
Advertisement