I’ve read a few articles like this one about a license plate value of NULL (about Droogie from DEF CON 27), including part of chapter three Little Data in the book Humble Pi by Matt Parker (talking about Steve Null), where storing a string value of “NULL” in a database matches NULL values.
With databases I’ve used (at least AFAIK), “NULL” isn’t the same as a NULL value. The state of a field being NULL is stored separately from the value.
So SQL like
SELECT bar FROM foo WHERE foobar IS NULL;
would be different than
SELECT bar FROM foo WHERE foobar = 'NULL';
When I first heard these stories I thought they must be urban legends, but after seeing a few more it made me wonder if I was missing something? Is it a matter of some databases that don’t distinguish “NULL” from is NULL (if so, which ones, any current ones)? Or is it a matter of whoever built the database application storing “NULL” as a string for NULL values or some other poor design?
Further reading:
- BBC on Jennifer Null
- Mashable on Droogie’s DEF CON Talk
- Matt Parker on Steve Null
- Passing NULL as a surname in SOAP
- Droogie’s Go NULL Yourself talk from Def Con 27
- Droogie finally got his plate renewed
To summarize my question:
- Do some databases mishandle NULL vs. “NULL” ?
- Is that only historical databases, or are there current databases that do the same?
- Is there a logical reason to have “NULL” equal IS NULL from architecture POV?
- Or are all of these an example of an application design failure?
- How can you mess it up that bad?
I’d really love to see an example of some SQL that confuses NULL and ‘NULL’.
Advertisement
Answer
You are describing two different concepts:
In relational databases
'null'
is aCHAR
value (of length 4), as simple as that.On the other side
null
is not a value, but represents a “missing value”. It’s not the absence of value either; it means that “the value does exist, but we failed to recover it.”
Therefore, they are both very different concepts. I don’t know a database that mishandles nulls as you present it. However, I can think of applications that do not distinguish them well. I would consider that defects on the app, not on the database engines themselves.
Anyway, here are a few SQL expressions in PostgreSQL and their values to illustrate the definitions above:
select 'null' = 'null', -- 1. TRUE 'null' = null, -- 2. null (actually UNKNOWN) 'null' <> null, -- 3. null (actually UNKNOWN) 'null' is null, -- 4. FALSE 'null' is not null, -- 5. TRUE null is null, -- 6. TRUE null is not null, -- 7. FALSE null = null, -- 8. null (actually UNKNOWN) null <> null, -- 9. null (actually UNKNOWN) null is not distinct from null, -- 10. TRUE (null = null) is unknown, -- 11. TRUE (null = null) is true, -- 12. FALSE (null = null) is false, -- 13. FALSE (null <> null) is unknown -- 14. TRUE
See running example at DB Fiddle.
Note. When you typically compare against null
the result is the bona fide value UNKNOWN
, not TRUE
, not FALSE
. However, most database drivers convert that value to a null
when sending it to your app, as you can see in the cases #2, #3, #8, and #9 above.