When I started writing database queries I didn’t know the JOIN keyword yet and naturally I just extended what I already knew and wrote queries like this:
SELECT a.someRow, b.someRow
FROM tableA AS a, tableB AS b
WHERE a.ID=b.ID AND b.ID= $someVar
Now that I know that this is the same as an INNER JOIN I find all these queries in my code and ask myself if I should rewrite them. Is there something smelly about them or are they just fine?
My answer summary: There is nothing wrong with this query BUT using the keywords will most probably make the code more readable/maintainable.
My conclusion: I will not change my old queries but I will correct my writing style and use the keywords in the future.
Advertisement
Answer
Filtering joins solely using WHERE
can be extremely inefficient in some common scenarios. For example:
SELECT * FROM people p, companies c
WHERE p.companyID = c.id AND p.firstName = 'Daniel'
Most databases will execute this query quite literally, first taking the Cartesian product of the people
and companies
tables and then filtering by those which have matching companyID
and id
fields. While the fully-unconstrained product does not exist anywhere but in memory and then only for a moment, its calculation does take some time.
A better approach is to group the constraints with the JOIN
s where relevant. This is not only subjectively easier to read but also far more efficient. Thusly:
SELECT * FROM people p JOIN companies c ON p.companyID = c.id
WHERE p.firstName = 'Daniel'
It’s a little longer, but the database is able to look at the ON
clause and use it to compute the fully-constrained JOIN
directly, rather than starting with everything and then limiting down. This is faster to compute (especially with large data sets and/or many-table joins) and requires less memory.
I change every query I see which uses the “comma JOIN
” syntax. In my opinion, the only purpose for its existence is conciseness. Considering the performance impact, I don’t think this is a compelling reason.