Skip to content
Advertisement

Do any databases get NULL and “NULL” confused, or is it always an application design failure?

I’ve read a few articles like this one about a license plate value of NULL (about Droogie from DEF CON 27), including part of chapter three Little Data in the book Humble Pi by Matt Parker (talking about Steve Null), where storing a string value of “NULL” in a database matches NULL values.

With databases I’ve used (at least AFAIK), “NULL” isn’t the same as a NULL value. The state of a field being NULL is stored separately from the value.

So SQL like

SELECT bar 
FROM foo
WHERE foobar IS NULL;

would be different than

SELECT bar 
FROM foo
WHERE foobar = 'NULL';

When I first heard these stories I thought they must be urban legends, but after seeing a few more it made me wonder if I was missing something? Is it a matter of some databases that don’t distinguish “NULL” from is NULL (if so, which ones, any current ones)? Or is it a matter of whoever built the database application storing “NULL” as a string for NULL values or some other poor design?

Further reading:

To summarize my question:

  • Do some databases mishandle NULL vs. “NULL” ?
    • Is that only historical databases, or are there current databases that do the same?
  • Is there a logical reason to have “NULL” equal IS NULL from architecture POV?
  • Or are all of these an example of an application design failure?
    • How can you mess it up that bad?

I’d really love to see an example of some SQL that confuses NULL and ‘NULL’.

Advertisement

Answer

You are describing two different concepts:

  • In relational databases 'null' is a CHAR value (of length 4), as simple as that.

  • On the other side null is not a value, but represents a “missing value”. It’s not the absence of value either; it means that “the value does exist, but we failed to recover it.”

Therefore, they are both very different concepts. I don’t know a database that mishandles nulls as you present it. However, I can think of applications that do not distinguish them well. I would consider that defects on the app, not on the database engines themselves.

Anyway, here are a few SQL expressions in PostgreSQL and their values to illustrate the definitions above:

select 
  'null' = 'null', -- 1. TRUE
  'null' = null,   -- 2. null (actually UNKNOWN)
  'null' <> null,  -- 3. null (actually UNKNOWN)
  'null' is null,     -- 4. FALSE
  'null' is not null, -- 5. TRUE
  null is null,       -- 6. TRUE
  null is not null,   -- 7. FALSE
  null = null,        -- 8. null (actually UNKNOWN)
  null <> null,       -- 9. null (actually UNKNOWN)
  null is not distinct from null, -- 10. TRUE
  (null = null) is unknown,       -- 11. TRUE
  (null = null) is true,          -- 12. FALSE
  (null = null) is false,         -- 13. FALSE
  (null <> null) is unknown       -- 14. TRUE

See running example at DB Fiddle.

Note. When you typically compare against null the result is the bona fide value UNKNOWN, not TRUE, not FALSE. However, most database drivers convert that value to a null when sending it to your app, as you can see in the cases #2, #3, #8, and #9 above.

User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement