Skip to content
Advertisement

Foreign and Primary Key Conceptual Questions

I am a newbie at SQL/PostgreSQL, and I had a conceptual question about foreign keys and keys in general:

Let’s say I have two tables: Table A and Table B.

A has a bunch of columns, two of which are A.id, A.seq. The primary key is btree(A.id, A.seq) and it has a foreign key constraint that A.id references B.id. Note that A.seq is a sequential column (first row has value 1, second has 2, etc).

Now say B has a bunch of columns, one of which is the above mentioned B.id. The primary key is btree(B.id).

I have the following questions:

  1. What exactly does btree do? What is the significance of having two column names in the btree rather than just one (as in btree(B.id)).
  2. Why is it important that A references B instead of B referencing A? Why does order matter when it comes to foreign keys??

Thanks so much! Please correct me if I used incorrect terminology for anything.

EDIT: I am using postgres

Advertisement

Answer

A btree index stored values in sorted order, which means you can not only search for a single primary key value, but you can also efficiently search for a range of values:

SELECT ... WHERE id between 6060842 AND 8675309

PostgreSQL also supports other index types, but only btree is supported for a unique index (for example, the primary key).

In your B table, the primary key being a single column id means that only one row can exist for each value in id. In other words, it is unique, and if you search for a value by primary key, it will find at most one row (it may also find zero rows if you don’t have a row with that value).

The primary key in your A table is for (id, seq). This means you can have multiple rows for each value of id. You can also have multiple rows for each value of seq as long as they are for different id values. The combination must be unique though. You can’t have more than one row with the same pair of values.

When a foreign key in A references B, it means that the row must exist in B before you are allowed to store the row in A with the same id value. But the reverse is not necessary.

Example:

Suppose B is for users and A is for phones. You must store a user before you can store a phone record for that user. You can store one or more phones for that user, one per row in the A table. We say that a row in A therefore references the user row in B, meaning, “this phone belongs to user #1234.”

But the reverse is not restricted. A user might have no phones (at least not known to this database), so there is no requirement for B to reference A. In other words, it is not required for a user to have a phone. You can store a row in B (the user) even if there is no row in A (the phones) for that user.

The reference also means you are not allowed to DELETE FROM B WHERE id = ? if there are rows in another table that reference that given row in B. Deleting that user would cause those other rows to become orphaned. No one would be able to know who those phones belonged to, if the user row they reference is deleted.

User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement