I am a newbie at SQL/PostgreSQL, and I had a conceptual question about foreign keys and keys in general:
Let’s say I have two tables: Table A
and Table B
.
A
has a bunch of columns, two of which are A.id, A.seq
. The primary key is btree(A.id, A.seq)
and it has a foreign key constraint that A.id
references B.id
. Note that A.seq
is a sequential column (first row has value 1, second has 2, etc).
Now say B
has a bunch of columns, one of which is the above mentioned B.id
. The primary key is btree(B.id)
.
I have the following questions:
- What exactly does
btree
do? What is the significance of having two column names in the btree rather than just one (as inbtree(B.id)
). - Why is it important that
A
referencesB
instead ofB
referencingA
? Why does order matter when it comes to foreign keys??
Thanks so much! Please correct me if I used incorrect terminology for anything.
EDIT: I am using postgres
Advertisement
Answer
A btree index stored values in sorted order, which means you can not only search for a single primary key value, but you can also efficiently search for a range of values:
SELECT ... WHERE id between 6060842 AND 8675309
PostgreSQL also supports other index types, but only btree is supported for a unique index (for example, the primary key).
In your B
table, the primary key being a single column id
means that only one row can exist for each value in id
. In other words, it is unique, and if you search for a value by primary key, it will find at most one row (it may also find zero rows if you don’t have a row with that value).
The primary key in your A
table is for (id, seq)
. This means you can have multiple rows for each value of id
. You can also have multiple rows for each value of seq
as long as they are for different id
values. The combination must be unique though. You can’t have more than one row with the same pair of values.
When a foreign key in A
references B
, it means that the row must exist in B
before you are allowed to store the row in A
with the same id
value. But the reverse is not necessary.
Example:
Suppose B
is for users and A
is for phones. You must store a user before you can store a phone record for that user. You can store one or more phones for that user, one per row in the A
table. We say that a row in A
therefore references the user row in B
, meaning, “this phone belongs to user #1234.”
But the reverse is not restricted. A user might have no phones (at least not known to this database), so there is no requirement for B
to reference A
. In other words, it is not required for a user to have a phone. You can store a row in B
(the user) even if there is no row in A
(the phones) for that user.
The reference also means you are not allowed to DELETE FROM B WHERE id = ?
if there are rows in another table that reference that given row in B
. Deleting that user would cause those other rows to become orphaned. No one would be able to know who those phones belonged to, if the user row they reference is deleted.