I’m still a relative novice when it comes to designing SQL databases, so apologies if this is something obvious that I’m missing.
I have a few tables of controlled vocabularies for certain values that I’m representing as FKs referencing the controlled vocab tables (there are few distinct vocabularies I’m trying to represent). My schema specification allows each of these vocabularies to also allow a controlled set of values for “unknown” information (coming from DataCite). Here is an example using a table dates
that must specify a date_type
, which should be either a value from date_types
or unknown_values
. I have a few more tables with this model as well, each with their own specific controlled vocabularies, but should also allow values from unknown_values
. So the values in unknown_values
should be shared among many tables of controlled vocabularies with similar structure to date_types
.
CREATE TABLE dates ( date_id integer NOT NULL PRIMARY KEY autoincrement , date_value date NOT NULL DEFAULT CURRENT_DATE , date_type text NOT NULL , FOREIGN KEY ( date_type ) REFERENCES date_types( date_type ) ); CREATE TABLE date_types ( date_type text NOT NULL PRIMARY KEY , definition text ); CREATE TABLE unknown_values ( code text NOT NULL PRIMARY KEY , definition text ); INSERT INTO date_types (date_type, definition) VALUES ('type_a', 'The first date type'), ('type_b', 'The second date type'); INSERT INTO unknown_values (code, definition) VALUES (':unac', 'Temporarily inaccessible'), (':unal', 'Unallowed, suppressed intentionally'), (':unap', 'Not applicable, makes no sense'), (':unas', 'Value unassigned (e.g., Untitled)'), (':unav', 'Value unavailable, possibly unknown'), (':unkn', 'Known to be unknown (e.g., Anonymous, Inconnue)'), (':none', 'Never had a value, never will'), (':null', 'Explicitly and meaningfully empty'), (':tba', 'To be assigned or announced later'), (':etal', 'Too numerous to list (et alia)');
My first thought was a view that creates a union of date_types
and unknown_values
, but you cannot make FK references onto a view, so that’s not suitable.
The “easiest” solution would be to duplicate the values from unknown_values
in each controlled vocabulary table (date_types
etc.), but this feels incorrect to have duplicated values.
I also thought about a single table for all the controlled vocabularies with a third field (something like vocabulary_category
with values like 'date'
), so all my tables could reference that one table, but then I would likely need a function and a CHECK
constraint to ensure that the value has the right “category”. This feels inelegant and messy.
I’m stumped about the best way to proceed, or what to search for to find help. I can’t imagine this is too rare of a requirement, but I can’t seem to find any solutions online. My target DB is SQLite, but I’d be interested in solutions that would be possible in PostgreSQL as well.
Advertisement
Answer
What you are requesting is the ability for a FK to have optional referenced table. Also as discovered Postgres nor SQLite(?) provides this option (afaik neither does any other RDBMS). Postgres at lease offers a work around, I do not know it its doable in SQLite. You need to:
- drop the not null constraint on the currently defined FK
- add a FK column referencing the
unknown_values
table - add
check
constraint that requires exactly 1 on the columnsdate_type
and the new FK column to be null. See the num_nulls function.
Changes you need: ( see demo )
alter table dates alter column date_type drop not null; alter table dates add unknown_value text references unknown_values(code); alter table dates add constraint one_null check (num_nulls(date_type, unknown_value ) = 1);
Note: Postgres does not support the autoincrement
key word. The same is accomplished using a generated column generated always as identity
(for older varsions use serial
).