How to remove values from an array that exist in another array?

Question

In PostgreSQL I have a table (Table) that contains an id column (ID) and another column (Values) that contains an array of strings. I have a select query (SelectQuery) that gets me an ID that matches Table.ID, as well as an array of values (RemoveValues). I would like to now remove from the Values array, any strings that are contained

Accepted Answer

There are many ways.First, you have to define whether there can be duplicates or NULL values in either of the arrays – and how to deal with those if possible. And whether to preserve original order of elements.Assuming either is possible, and these are the rules:Count duplicates separately. So if Values has five elements ‘foo’ and RemoveValues has three elements ‘foo’, thee will be two element ‘foo’ in the result.Treat NULL values equal. So NULL can be removed with NULL (though in other contexts NULL = NULL yields NULL).Order of array elements does not have to be preserved. produces unique ID values.That matches the behavior of standard SQL EXCEPT ALL. See:Using EXCEPT clause in PostgreSQLSo:SELECT t.*, sub.result_arrayFROM tbl tJOIN () s USING ("ID")CROSS JOIN LATERAL ( SELECT ARRAY ( SELECT unnest(t."Values") EXCEPT ALL SELECT unnest(s."RemoveValues") ) ) sub(result_array);About the LATERAL subquery:What is the difference between LATERAL JOIN and a subquery in PostgreSQL?Can be plain CROSS JOIN because an ARRAY constructor always produces a row.You could wrap the functionality in a custom function:CREATE FUNCTION f_arr_except_arr(a1 text[], a2 text[]) RETURNS text[] LANGUAGE SQL IMMUTABLE PARALLEL SAFEBEGIN ATOMICSELECT ARRAY (SELECT unnest(a1) EXCEPT ALL SELECT unnest(a2));END;Requires Postgres 14. See:What does BEGIN ATOMIC … END mean in a Postgresql SQL function / procedure?For older (or any) Postgres versions, and for any array type:CREATE OR REPLACE FUNCTION f_arr_except_arr(a1 anyarray, a2 anyarray) RETURNS anyarray LANGUAGE SQL IMMUTABLE PARALLEL SAFE AS$func$SELECT ARRAY (SELECT unnest(a1) EXCEPT ALL SELECT unnest(a2));$func$;db<>fiddle here(PARALLEL SAFE only for Postgres 9.6 or later, though.)Then your query can be something like:SELECT *, f_arr_except_arr(t."Values", s."RemoveValues") AS result_valuesFROM tbl tJOIN () s USING ("ID");It’s unclear whether you actually locked in CaMeL-case spelling with double-quotes. I assumed as much, but better you don’t. See:Are PostgreSQL column names case-sensitive?And your UPDATE can be:UPDATE tbl tSET "Values" = f_arr_except_arr(t."Values", s."RemoveValues")FROM ) sWHERE s."ID" = t."ID"AND "Values" IS DISTINCT FROM f_arr_except_arr(t."Values", s."RemoveValues");The additional WHERE clause is optional to avoid empty updates. See:Return rows of a table that actually changed in an UPDATEHow do I (or can I) SELECT DISTINCT on multiple columns?

Advertisement

Answer