Canonical way to store reciprocal data in SQL

I have many instances of the same class. These objects can be linked to each other, and each link can have a weight, as in an undirected weighted graph. I want to store the relationship between each pair of objects in my MySQL database.

The data looks like this:

a ===WEIGHT=== b
a ===WEIGHT=== c
b ===WEIGHT=== a
...

I could create a table with this structure:

object1_id | object2_id | weight

But when looking up the weight between two objects, I wouldn't know which one was stored as object1 and which as object2, so I'd need two queries. Likewise, before adding a weight I'd need two queries to check that the pair isn't already in the database.

The two queries would be:

SELECT weight from tableName where object1_id = $1 AND object2_id = $2;
SELECT weight from tableName where object1_id = $2 AND object2_id = $1;

to cover both possibilities, since I don't know in which order the pair was stored.

I assume there is a better way to store data like this. There are probably many answers out there already, but I didn't know what this problem is called, so it was hard to search for solutions.

Thank you for your advice.

Answer

If each pair is stored only once, in either order, you can use:

select t.*
from t
where (object1_id, object2_id) in ( ($1, $2), ($2, $1) );

If both directions might be in the table and you arbitrarily want one, add limit 1.

If you want to enforce that only one edge per pair is in the database, you can use a unique index on expressions (in MySQL, this requires version 8.0.13 or later):

create unique index unq_t_object1_id_object2_id on
    t( (least(object1_id, object2_id)), (greatest(object1_id, object2_id)) );

You might also want to add a check constraint so the ids are always stored in order (MySQL enforces check constraints from version 8.0.16):

alter table t add constraint chk_object1_id_object2_id check (object1_id < object2_id);
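
To see how the two constraints behave together, here is a minimal runnable sketch. It is an illustration under assumptions, not the exact setup from the question: it uses Python's built-in sqlite3 module with an in-memory database instead of MySQL, SQLite's two-argument min()/max() stand in for least()/greatest(), and the helper name try_insert is made up for this example.

```python
import sqlite3

# Assumption: SQLite stands in for the real database; its two-argument
# min()/max() play the role of least()/greatest().
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (
        object1_id INTEGER NOT NULL,
        object2_id INTEGER NOT NULL,
        weight     REAL    NOT NULL,
        CHECK (object1_id < object2_id)
    );
    CREATE UNIQUE INDEX unq_t_object1_id_object2_id
        ON t (min(object1_id, object2_id), max(object1_id, object2_id));
""")

def try_insert(a, b, weight):
    """Insert the row exactly as given; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO t VALUES (?, ?, ?)", (a, b, weight))
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert(1, 2, 0.5)    # ordered pair: stored
reversed_ok = try_insert(2, 1, 0.5) # wrong order: rejected by the check constraint
duplicate_ok = try_insert(1, 2, 0.9)  # same pair again: rejected by the unique index
```

Note that the strict comparison in the check constraint also rejects a pair where both ids are equal, so an object cannot be linked to itself under this scheme.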

With these conditions enforced, you can change the above query to:

select t.*
from t
where object1_id = least($1, $2) and
      object2_id = greatest($1, $2);

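This is actually easier for the optimizer: each column is compared against a single value, so an index on (object1_id, object2_id) can be used for a direct lookup.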
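
On the application side, the same idea can be wrapped in two small helpers so callers never have to think about argument order. This is a sketch under assumptions: it again uses Python's sqlite3 with an in-memory database rather than MySQL, SQLite's min()/max() replace least()/greatest(), a composite primary key provides the uniqueness, and the names set_weight and get_weight are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for the real database in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE t (
        object1_id INTEGER NOT NULL,
        object2_id INTEGER NOT NULL,
        weight     REAL    NOT NULL,
        PRIMARY KEY (object1_id, object2_id),
        CHECK (object1_id < object2_id)
    )
""")

def set_weight(a, b, weight):
    """Store the weight of the undirected edge a-b, whichever order a and b arrive in."""
    lo, hi = min(a, b), max(a, b)  # normalize so the smaller id is always object1_id
    with conn:
        # Overwrites the existing row if the pair is already stored.
        conn.execute("INSERT OR REPLACE INTO t VALUES (?, ?, ?)", (lo, hi, weight))

def get_weight(a, b):
    """Return the weight of edge a-b, or None if the pair is not stored."""
    row = conn.execute(
        "SELECT weight FROM t "
        "WHERE object1_id = min(?, ?) AND object2_id = max(?, ?)",
        (a, b, a, b),
    ).fetchone()
    return row[0] if row else None

set_weight(2, 1, 0.5)   # stored as (1, 2)
set_weight(1, 2, 0.75)  # same pair, so the weight is replaced
```

With this in place, get_weight(1, 2) and get_weight(2, 1) hit the same row with a single query, which is the point of keeping the pair in a fixed order.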

User contributions licensed under: CC BY-SA