Skip to content
Advertisement

Is there something wrong with joins that don’t use the JOIN keyword in SQL or MySQL?

When I started writing database queries I didn’t know the JOIN keyword yet and naturally I just extended what I already knew and wrote queries like this:

SELECT a.someRow, b.someRow 
FROM tableA AS a, tableB AS b 
WHERE a.ID=b.ID AND b.ID= $someVar

Now that I know that this is the same as an INNER JOIN I find all these queries in my code and ask myself if I should rewrite them. Is there something smelly about them or are they just fine?


My answer summary: There is nothing wrong with this query BUT using the keywords will most probably make the code more readable/maintainable.

My conclusion: I will not change my old queries but I will correct my writing style and use the keywords in the future.

Advertisement

Answer

Filtering joins solely using WHERE can be extremely inefficient in some common scenarios. For example:

SELECT * FROM people p, companies c 
    WHERE p.companyID = c.id AND p.firstName = 'Daniel'

Most databases will execute this query quite literally, first taking the Cartesian product of the people and companies tables and then filtering by those which have matching companyID and id fields. While the fully-unconstrained product does not exist anywhere but in memory and then only for a moment, its calculation does take some time.

A better approach is to group the constraints with the JOINs where relevant. This is not only subjectively easier to read but also far more efficient. Thusly:

SELECT * FROM people p JOIN companies c ON p.companyID = c.id
    WHERE p.firstName = 'Daniel'

It’s a little longer, but the database is able to look at the ON clause and use it to compute the fully-constrained JOIN directly, rather than starting with everything and then limiting down. This is faster to compute (especially with large data sets and/or many-table joins) and requires less memory.

I change every query I see which uses the “comma JOIN” syntax. In my opinion, the only purpose for its existence is conciseness. Considering the performance impact, I don’t think this is a compelling reason.

User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement