Skip to content
Advertisement

Select equal number of random rows with respect to a column

Consider I have a table like this

Col1 || Col2
-------------
a    ||  0
b    ||  0
c    ||  1
d    ||  1
e    ||  0

How can I select rows from it so that I have equal number of 1s and 0s, like below can be a result

Col1 || Col2
-------------
a    ||  0
c    ||  1
d    ||  1
e    ||  0

The rows removed/left out are at random and deleting from an existing table would work as well.

Advertisement

Answer

For each col2 partition, you can give each row a row number and then find those rows where there is only one instance of the row number and delete them:

DELETE FROM table_name
WHERE ROWID IN (
  SELECT MIN(ROWID)
  FROM   (
    SELECT ROW_NUMBER() OVER (PARTITION BY col2 ORDER BY DBMS_RANDOM.VALUE)
             AS rn
    FROM   table_name
  )
  GROUP BY rn
  HAVING COUNT(*) < 2
);

If you just want to SELECT the rows then you can use a similar technique:

SELECT col1, col2
FROM   (
  SELECT col1,
         col2,
         COUNT(*) OVER (PARTITION BY rn) AS cnt
  FROM   (
    SELECT col1,
           col2,
           ROW_NUMBER() OVER (PARTITION BY col2 ORDER BY DBMS_RANDOM.VALUE)
             AS rn
    FROM   table_name
  )
)
WHERE cnt = 2;

db<>fiddle here

User contributions licensed under: CC BY-SA
8 People found this is helpful
Advertisement