Skip to content
Advertisement

Extra null rows when using LEFT OUTER JOIN

I have three tables SpringCount, SpringObservation, and MonitoringPoint. SpringCount and SpringObservation particpate in a one to many relationship. Some counts have no bird observations, some counts do. I want to return a table with all the counts represented and nulls in the appropriate fields when they don’t have a matching observation. I understand when doing a left outer join this will produce duplicate rows and that is what I want, however, I am ending up with both the duplicate rows I want and duplicate rows full of nulls that I don’t think should be there. I’m not sure where in my query I’m going wrong.

This is the code I am executing (just trust that all the GDB_TO_DATE field exists and is not null for all records even though it’s not in my example tables above)

This is the table I expect to get

This is the table I actually end up with. I’m not sure why a CountID list all it’s matches, but also produce a (seemingly) random amount of non-matches for that same CountID.

Is there something in my query that needs to be rearranged or am I mistaken in my thinking of how left outer joins work? I have used SELECT DISTINCT and that at least groups all those erroneous null value rows together which helps, but they still exist and I’m not sure why.

Advertisement

Answer

Thanks to @underscore_d as he helped me get the answer. I needed to look at the fields that weren’t part of my query to figure out what was happening. Turns out I needed to do a subquery because the GDB_TO_DATE fields in each table were acting in unexpected ways. What I needed was all the counts that end in a 9999 year to match with any of the observations that ended in a 9999 year. Anything that doesn’t end in a 9999 year is a historical/archived record, but they’re kept in the same table as current records (it’s a spatial database and the software that creates these make them this way, definitely not my choice). Due to the fact that archived and current records are located in the same table, just using the Count IDs to match the records isn’t enough because you’ll end up matching a current count to an archived observation that still has that ID. I needed to select all the current observations before attempting to join to the current counts, that way the archived observations wouldn’t be included in the join.

Here’s the final code:

User contributions licensed under: CC BY-SA
8 People found this is helpful
Advertisement