Skip to content
Advertisement

Will I need to do a join of 4 tables to fulfill this SQL query or is there a simpler method?

I currently need to do the following:

Find the names of all course and the students enrolled on them which take place on a Friday afternoon and have at least 2 students enrolled on it.

I am thinking what I should do is Join the tables titled student, takes, course, section and time_slot together and from there do the SQL query, but this seems overkill to me.

First I would get a list of all the courses and all of the students names that take the courses by joining the takes and students tables.

Then, I would find all of the courses that take place on a Friday afternoon using the time_slot_id, which would have to be “D” OR “F” OR “G” and the day would be “F” to signify Friday. Then I would match the timeslot with the section.

After that I am stuck.

Here is what I have tried:

and

The bottom query works, but does not have the names of the students and the top one is a lot closer to what I would need, but returns an error.

Here is the full mySQL code that I have.

Advertisement

Answer

OK… let’s build this up using your strategy.

First I would get a list of all the courses and all of the students names that take the courses by joining the takes and students tables.

Your query below is the correct start for this – it finds all the courses which have more than one student.

An important caveat to the above is that it doesn’t matter which semester/etc the students do it – just that the course has had more than one enrolled in it at any time, in any timeslot, etc.

To get the other info (names of courses and students) you’d just link this to the course table and takes & student tables e.g.,

Of course, you could change the above so that the sub-query is in the WHERE clause, or in a CTE.

Note that there is a more efficient way of doing the above using Windowed functions, but I’m guessing that’s a bridge too far at the moment.

However, as you said in your strategy – this isn’t the entire answer – we need to filter out the courses.

Then, I would find all of the courses that take place on a Friday afternoon using the time_slot_id, which would have to be “D” OR “F” OR “G” and the day would be “F” to signify Friday. Then I would match the timeslot with the section.

I’m assuming you want the timeslots to be calculated rather than hard-coded (in case you add new timeslots e.g., one that runs on Monday, Tuesday, Thursday mornings). As such, I’ll define an ‘afternoon’ slot as one that starts at or after 12:00pm but before 6:00pm.

We can get the relevant timeslots via the following

The above should result in ‘D’ and ‘G’ (note that F does not count according to your data). I used DISTINCT in the above to remove duplicates in case a given time slot has a double class e.g., 2 classes in a given afternoon.

We then want the relevant course_ids – which we get from joining the above to section (and removing the reference to time_slot_id as we no longer need it).

Now, all we need to do is to filter out the courses from above, using the sections from above. So the SQL to get the (initial) answer is…

Once again, you could use WHERE or CTEs etc instead of the sub-query.

Here is a db<>fiddle with each of the steps from above. Note the first section is hidden – it contains your data structures and data (but with the course times set to int rather than time).

Note – and this is a big note – you need to be very careful about what this represents.

  • Courses may be delivered in different time slots, and in different semesters. For example, CS-101 is delivered in Fall 2009 in time_slot H; it is also delivered in Spring 2010 with time_slot F.
  • The above answer does not care about this. As long as a course has more than one student (at any time), and has at one stage been delivered on a Friday afternoon, then it qualifies. The students won’t necessarily be in class together at the same time – or indeed, you can have students who didn’t even attend on a Friday afternoon (e.g., they were taking the course in another stream/timeslot).

While my answer may give the initial literal answer to the question, the more meaningful question interpretation (“which classes on Friday afternoon have more than 1 student in them at the same time?”) is probably what they’re asking.

And as this looks like homework – I suggest that modifying the above should be your next step. The reason I chose sub-queries to do the filtering in the above answer, is that I find it easier to then modify then for different filtering.

PS Here’s a hint for how/what to modify: The answer above does all its work around the ‘course’ table e.g., it finds students by course, time slots by course, etc. Instead, you may want to focus on the ‘section’ table instead as that provides data about courses at specific times.

User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement