Similar to this question, how can I find if a NULL value exists in an array?
Here are some attempts.
SELECT num, ar, expected, ar @> ARRAY[NULL]::int[] AS test1, NULL = ANY (ar) AS test2, array_to_string(ar, ', ') <> array_to_string(ar, ', ', '(null)') AS test3 FROM ( SELECT 1 AS num, '{1,2,NULL}'::int[] AS ar, true AS expected UNION SELECT 2, '{1,2,3}'::int[], false ) td ORDER BY num; num | ar | expected | test1 | test2 | test3 -----+------------+----------+-------+-------+------- 1 | {1,2,NULL} | t | f | | t 2 | {1,2,3} | f | f | | f (2 rows)
Only a trick with array_to_string
shows the expected value. Is there a better way to test this?
Advertisement
Answer
Postgres 9.5 or later
Or use array_position()
. Basically:
SELECT array_position(arr, NULL) IS NOT NULL AS array_has_null
See demo below.
Postgres 9.3 or later
You can test with the built-in functions array_remove()
or array_replace()
.
Postgres 9.1 or any version
If you know a single element that can never exist in your arrays, you can use this fast expression. Say, you have an array of positive numbers, and -1
can never be in it:
-1 = ANY(arr) IS NULL
Related answer with detailed explanation:
If you cannot be absolutely sure, you could fall back to one of the expensive but safe methods with unnest()
. Like:
(SELECT bool_or(x IS NULL) FROM unnest(arr) x)
or:
EXISTS (SELECT 1 FROM unnest(arr) x WHERE x IS NULL)
But you can have fast and safe with a CASE
expression. Use an unlikely number and fall back to the safe method if it should exist. You may want to treat the case arr IS NULL
separately. See demo below.
Demo
SELECT num, arr, expect , -1 = ANY(arr) IS NULL AS t_1 -- 50 ms , (SELECT bool_or(x IS NULL) FROM unnest(arr) x) AS t_2 -- 754 ms , EXISTS (SELECT 1 FROM unnest(arr) x WHERE x IS NULL) AS t_3 -- 521 ms , CASE -1 = ANY(arr) WHEN FALSE THEN FALSE WHEN TRUE THEN EXISTS (SELECT 1 FROM unnest(arr) x WHERE x IS NULL) ELSE NULLIF(arr IS NOT NULL, FALSE) -- catch arr IS NULL -- 55 ms -- ELSE TRUE -- simpler for columns defined NOT NULL -- 51 ms END AS t_91 , array_replace(arr, NULL, 0) <> arr AS t_93a -- 99 ms , array_remove(arr, NULL) <> arr AS t_93b -- 96 ms , cardinality(array_remove(arr, NULL)) <> cardinality(arr) AS t_94 -- 81 ms , COALESCE(array_position(arr, NULL::int), 0) > 0 AS t_95a -- 49 ms , array_position(arr, NULL) IS NOT NULL AS t_95b -- 45 ms , CASE WHEN arr IS NOT NULL THEN array_position(arr, NULL) IS NOT NULL END AS t_95c -- 48 ms FROM ( VALUES (1, '{1,2,NULL}'::int[], true) -- extended test case , (2, '{-1,NULL,2}' , true) , (3, '{NULL}' , true) , (4, '{1,2,3}' , false) , (5, '{-1,2,3}' , false) , (6, NULL , null) ) t(num, arr, expect);
Result:
num | arr | expect | t_1 | t_2 | t_3 | t_91 | t_93a | t_93b | t_94 | t_95a | t_95b | t_95c -----+-------------+--------+--------+------+-----+------+-------+-------+------+-------+-------+------- 1 | {1,2,NULL} | t | t | t | t | t | t | t | t | t | t | t 2 | {-1,NULL,2} | t | f --!! | t | t | t | t | t | t | t | t | t 3 | {NULL} | t | t | t | t | t | t | t | t | t | t | t 4 | {1,2,3} | f | f | f | f | f | f | f | f | f | f | f 5 | {-1,2,3} | f | f | f | f | f | f | f | f | f | f | f 6 | NULL | NULL | t --!! | NULL | f | NULL | NULL | NULL | NULL | f | f | NULL
Note that array_remove()
and array_position()
are not allowed for multi-dimensional arrays. All expressions to the right of t_93a
only work for 1-dimenstioal arrays.
db<>fiddle here – Postgres 13, with more tests
Old sqlfiddle
Benchmark setup
The added times are from a benchmark test with 200k rows in Postgres 9.5. This is my setup:
CREATE TABLE t AS SELECT row_number() OVER() AS num , array_agg(elem) AS arr , bool_or(elem IS NULL) AS expected FROM ( SELECT CASE WHEN random() > .95 THEN NULL ELSE g END AS elem -- 5% NULL VALUES , count(*) FILTER (WHERE random() > .8) OVER (ORDER BY g) AS grp -- avg 5 element per array FROM generate_series (1, 1000000) g -- increase for big test case ) sub GROUP BY grp;
Function wrapper
For repeated use, I would create a function in Postgres 9.5 like this:
CREATE OR REPLACE FUNCTION f_array_has_null (anyarray) RETURNS bool LANGUAGE sql IMMUTABLE PARALLEL SAFE AS 'SELECT array_position($1, NULL) IS NOT NULL';
PARALLEL SAFE
only for Postgres 9.6 or later.
Using a polymorphic input type this works for any array type, not just int[]
.
Make it IMMUTABLE
to allow performance optimization and index expressions.
But don’t make it STRICT
, which would disable “function inlining” and impair performance because array_position()
is not STRICT
itself. See:
If you need to catch the case arr IS NULL
:
CREATE OR REPLACE FUNCTION f_array_has_null (anyarray) RETURNS bool LANGUAGE sql IMMUTABLE PARALLEL SAFE AS 'SELECT CASE WHEN $1 IS NOT NULL THEN array_position($1, NULL) IS NOT NULL END';
For Postgres 9.1 use the t_91
expression from above. The rest applies unchanged.
Closely related: