Skip to content
Advertisement

Oracle query to group multi level many to many relationships into independent groups [closed]

BASED on the multi level MANY TO MANY dependency on student & subject cross reference, how to form a ORACLE sql/plsql query to GROUP the Student_Subject table so that groups are independent & exclusive as following? I have 519 rows in actual table. I am looking for a performant solution

Example Source table :

enter image description here

Expected Target output:

enter image description here

Actual Source Table Data creation query

CREATE TABLE student_subject ( student_name, subject_name ) AS
 SELECT 'SMITH', 'CHEMISTRY' FROM DUAL UNION ALL
 SELECT 'ROBIN', 'PHYSICS' FROM DUAL UNION ALL
 SELECT 'SAM', 'PHYSICS' FROM DUAL UNION ALL
 SELECT 'SAM', 'MATH' FROM DUAL UNION ALL
 SELECT 'JENNY', 'MATH' FROM DUAL UNION ALL
 SELECT 'JENNY', 'ACCOUNTS' FROM DUAL UNION ALL
 SELECT 'DON', 'ENGLISH' FROM DUAL UNION ALL
 SELECT 'DON', 'SPANISH' FROM DUAL UNION ALL
 SELECT 'RON', 'HISTORY' FROM DUAL UNION ALL
 SELECT 'JOVAN', 'HISTORY' FROM DUAL UNION ALL
 -- etc. for an additional 509 more rows

See this dbfiddle for a sample set of 475 rows.

Advertisement

Answer

I am looking for a performant solution

If you want performant, then forget about SQL. With your (highly connected) data, there’s just no way to pull a CONNECT BY trick that isn’t going to look at a LOT of extra, unnecessary data.

Here’s a simple PL/SQL block that will do it very quickly. More quickly, by far, I believe, than is possible with pure SQL. (As usual, I’ll happily eat crow on that if it means learning something new…)

This works by finding the records with the lowest subject_name from among the rows that have not yet been assigned to a group. These get the next group number assigned. Then, we loop through indefinitely looking for rows directly related to rows in the new group and assigning those to the new group. When we find no more directly related rows, that group is finished and we start the next group. When we fail to create any new groups, we’re finished.

-- Add a column to student_subject to hold the group number results
ALTER TABLE student_subject ADD ( group_number NUMBER );

-- Let's time it...
SET TIMING ON

-- Go through and assign group numbers to each record.
-- Doing it iteratively in PL/SQL means we can easily look at each row only once.
BEGIN

  UPDATE student_subject SET group_number = null;
  
  LOOP
    UPDATE student_subject ss
    SET group_number = ( SELECT nvl(max(group_number),0)+1 FROM student_subject ss2 )  -- Assign next group number
    WHERE subject_name = ( SELECT min(subject_name) FROM student_subject ss2 WHERE group_number IS NULL );
    
    EXIT WHEN SQL%ROWCOUNT = 0;  -- done..  all rows have groups now.
  
    LOOP
      UPDATE student_subject ss
      SET group_number = ( SELECT max(group_number) FROM student_subject ss2 )
      WHERE group_number is null
      AND EXISTS ( SELECT 'direct relation' FROM student_subject ss2 WHERE (ss2.subject_name = ss.subject_name OR ss2.student_name = ss.student_name ) AND ss2.group_number = ( SELECT max(group_number) FROM student_subject ss3 ) );
      
      EXIT WHEN SQL%ROWCOUNT = 0;  -- finished group.  move to next group
    END LOOP;
  END LOOP;

END;
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.128

You can optimize it further by replacing the (SELECT... expressions in the update with PL/SQL variables. E.g., to keep track of the current group number instead of selecting MAX() every time. I was just lazy.

User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement