Why does selecting a single attribute return fewer rows than selecting all columns in Oracle SQL?

Question

The tables created and the queries made are not the primary focus of this question. What confuses me is why the first query and the second query return different numbers of rows. I want to find the sid of the sailors who have not ordered all the red boats. The first query above returns the correct rows I…

Accepted Answer

As far as I can tell, this looks like a bug in Oracle's implementation of natural [...] join. I will do some testing to see if it affects inner joins too.

Instead of natural join, one can use the syntax left|right|inner join USING(...) and give the list of column names in the using clause. The list of columns should be the list of ALL columns that have the same name in the two members of the join.

Very simple experimentation with the data you provided (+1 even for that alone) shows that the results are the same as if we had written using(sid, bid) in the first query, but only using(sid) in the second. If in either query, regardless of whether it selects * or sid, you use the using syntax, you get the same number of rows in the output as either your first or your second query, depending on what you put in the using clause.

So, what Oracle does for the second query is simply wrong. I can only speculate, but I believe Oracle looks at the SELECT clause first, and perhaps at other clauses, to see what columns it needs to retrieve from each table. (For example, this tells the optimizer what indexes it could use, etc.) And at this step, Oracle decides, in your second query, that it doesn't need bid. Then when it translates "natural" join to its own internal code, it doesn't throw in bid as a join column. Which is wrong, and which is why I called this a "bug".

IMPORTANT NOTE: Others have commented that both queries are "wrong" in that they do not solve the problem you were trying to solve. That may be entirely true. I didn't even look at your problem specification; here I am answering your question, which is valid regardless of the problem: why do the two queries produce different numbers of rows? Even if they are "wrong" for your use case, they should essentially give "the same" wrong answer, not "different" wrong answers. That is the only thing I discussed above.
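To make the using(sid, bid) versus using(sid) difference concrete, here is a minimal sketch. It is not the original poster's schema or data: the table names reserves and candidates and their rows are hypothetical, and it runs on SQLite through Python's sqlite3 module rather than on Oracle. It only illustrates how the choice of join columns changes the row count, and that a natural join is expected to behave like USING over all shared column names; it does not reproduce the Oracle behaviour itself.

```python
import sqlite3

# Hypothetical tables and rows, for illustration only (not the asker's data).
# SQLite stands in for Oracle here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE reserves   (sid INTEGER, bid INTEGER);
    CREATE TABLE candidates (sid INTEGER, bid INTEGER);
    INSERT INTO reserves   VALUES (1, 101), (1, 102), (2, 101);
    INSERT INTO candidates VALUES (1, 101), (1, 103), (2, 102);
""")


def count_rows(query):
    """Run a query and return how many rows it produces."""
    return len(conn.execute(query).fetchall())


# Join on every shared column name: rows must agree on both sid and bid.
both_columns = count_rows(
    "SELECT * FROM reserves JOIN candidates USING (sid, bid)"
)

# Join on sid alone: every pairing of rows with the same sid is kept.
sid_only = count_rows(
    "SELECT sid FROM reserves JOIN candidates USING (sid)"
)

# A natural join should use all shared column names, whatever is selected.
natural = count_rows(
    "SELECT sid FROM reserves NATURAL JOIN candidates"
)

print(both_columns, sid_only, natural)
```

With this data, joining on both columns keeps only the pairs that agree on sid and bid, while joining on sid alone keeps every pairing with a matching sid, so the second count is larger. The natural join matches the two-column USING form even though only sid is selected, which is the behaviour one would expect from Oracle as well.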

Answer