Compare Two Relations in SQL

Question

I just started studying SQL. This is a demo given by the teacher in an online course, and it works fine. The statement is looking for "students such that number of other students with same GPA is equal to number of other students with same sizeHS": It seems that in this where clause, we're comparing two relations (because the

Accepted Answer

“because the result of a subquery is a relation”

Relation is the scientific name for what we call a table in a database, and I like the name “table” much better than “relation”. A table is easy to imagine. We know them from our school timetable, for instance. Yes, we relate things inside a table (day, time, and the subject taught in school), but we can also relate tables to tables (pupils’ timetables with the table of classrooms, the overall subject schedule, and the teachers’ timetables). As such, tables in an RDBMS are also related to each other (hence the name relational database management system). I find the name “relation” for a table quite confusing (and many people use the word “relation” to describe the relations between tables instead).

So, yes, a query result itself is again a table (“relation”). And from tables we can of course select:

    select * from (select * from b) as subq;

And then there are scalar queries that return exactly one row and one column. select count(*) from b is such a query. While this is still a table we can select from

    select * from (select count(*) as cnt from b) as subq;

we can even use them where we usually have single values, e.g. in the select clause:

    select a.*, (select count(*) from b) as cnt from a;

In your query you have two scalar subqueries in your where clause.

With subqueries there is another distinction to make: we have correlated and non-correlated subqueries. The last query I have just shown contains a non-correlated subquery. It selects the count of b rows for every single result row, no matter what else that row contains. A correlated subquery, on the other hand, may look like this:

    select a.*, (select count(*) from b where b.x = a.y) as cnt from a;

Here, the subquery is related to the main table. For every result row we look up the count of b rows matching the a row we are displaying via where b.x = a.y, so the count is different from row to row (but we’d get the same count for a rows sharing the same y value).

Your subqueries are also correlated. As with the select clause, the where clause deals with one row at a time (in order to keep or dismiss it). So we look at one student S1 at a time. For this student we count other students (S2, where S2.sID <> S1.sID) who have the same GPA (and S2.GPA = S1.GPA) and count other students who have the same sizeHS. We only keep students (S1) where there are exactly as many other students with the same GPA as there are with the same sizeHS.

UPDATE

As to dealing with multiple tuples as in

    select *
    from Student S1
    where
    (
      select count(*), avg(grade)
      from Student S2
      where S2.sID <> S1.sID and S2.GPA = S1.GPA
    ) =
    (
      select count(*), avg(grade)
      from Student S2
      where S2.sID <> S1.sID and S2.sizeHS = S1.sizeHS
    );

this is possible in some DBMS, but not in SQL Server. SQL Server does not support tuple (row value) comparisons.

But there are other means to achieve the same. You could just add two subqueries:

    select * from student s1
    where (...) = (...) -- compare counts here
    and (...) = (...) -- compare averages here

Or get the data in the FROM clause and then deal with it. E.g.:

    select *
    from Student S1
    cross apply
    (
      select count(*) as cnt, avg(grade) as avg_grade
      from Student S2
      where S2.sID <> S1.sID and S2.GPA = S1.GPA
    ) sx
    cross apply
    (
      select count(*) as cnt, avg(grade) as avg_grade
      from Student S2
      where S2.sID <> S1.sID and S2.sizeHS = S1.sizeHS
    ) sy
    where sx.cnt = sy.cnt and sx.avg_grade = sy.avg_grade;
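If you want to watch the two correlated scalar subqueries being evaluated row by row, here is a minimal sketch using Python's built-in sqlite3 module. The query is my reconstruction from the description above (count of other students with the same GPA compared to count of other students with the same sizeHS), and the five sample rows are made up purely for illustration; only the names Student, sID, GPA and sizeHS come from the question.

```python
import sqlite3

# Hypothetical sample data: (sID, GPA, sizeHS). Not from the course database.
rows = [
    (1, 3.9, 1000),
    (2, 3.6, 1500),
    (3, 3.9, 1000),  # shares both GPA and sizeHS with student 1
    (4, 3.6, 200),   # shares only GPA with student 2
    (5, 2.9, 2000),  # shares nothing with anybody
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (sID INTEGER PRIMARY KEY, GPA REAL, sizeHS INTEGER)")
conn.executemany("INSERT INTO Student VALUES (?, ?, ?)", rows)

# Each parenthesized subquery returns one row with one column (a scalar),
# and each is correlated: it refers to the current S1 row.
query = """
SELECT S1.sID
FROM Student S1
WHERE (SELECT count(*) FROM Student S2
       WHERE S2.sID <> S1.sID AND S2.GPA = S1.GPA)
    = (SELECT count(*) FROM Student S2
       WHERE S2.sID <> S1.sID AND S2.sizeHS = S1.sizeHS)
ORDER BY S1.sID
"""

kept_ids = [row[0] for row in conn.execute(query)]
print(kept_ids)  # [1, 3, 5]
```

Students 1 and 3 are kept because each has one other student with the same GPA and one with the same sizeHS. Student 5 is kept because both counts are zero. Students 2 and 4 are dismissed because the GPA count is 1 while the sizeHS count is 0. So the where clause compares two single values per row, not two whole tables.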

