Let’s assume I’ve got a database with two tables: people
which contains person’s id and his/her birth year for each person and parents
which contains the (parent_id, child_id)
pairs to represent the relative relationships between people. To make the explanation easier let’s assume each person has either 0 children or 1 child. Here is an example of the data in the database (as a set of SQL statements to create it on MySQL):
CREATE TABLE people ( id INTEGER NOT NULL AUTO_INCREMENT, birth_year INTEGER NOT NULL, CONSTRAINT PRIMARY KEY(id) ); CREATE TABLE parents ( parent_id INTEGER NOT NULL, child_id INTEGER NOT NULL, CONSTRAINT PRIMARY KEY(parent_id,child_id) ); -- Not creating FOREIGN KEYS, because it's just an example INSERT INTO people (id, birth_year) VALUES (1, 1937); INSERT INTO people (id, birth_year) VALUES (2, 1943); INSERT INTO people (id, birth_year) VALUES (3, 1974); INSERT INTO people (id, birth_year) VALUES (4, 2001); INSERT INTO people (id, birth_year) VALUES (5, 2020); INSERT INTO parents (parent_id, child_id) VALUES (1, 4); INSERT INTO parents (parent_id, child_id) VALUES (3, 5);
Result:
Now I want to make up a query which will retrieve the id
of the person, whose child was born at the earliest age of the parent (for example, if I was born in 1234 and my child was born in 1300, then my age when my child was born was 1300 - 1234 = 66
and I would like to find a person which got their child earlier than others).
I have made up some queries for it, but each of them either didn’t work or had duplications or both. The one I like most is
SELECT id AS pid, -- Parent id (SELECT child_id FROM parents WHERE parent_id=pid) AS cid -- Child id FROM people WHERE EXISTS(SELECT cid) -- Only selecting parents who have children (not sure this part is correct) ORDER BY (birth_year - (SELECT birth_year FROM people WHERE id=cid)) ASC -- Order by age when they got their child LIMIT 1;
But this one fails in MySQL with the error:
ERROR 1054 (42S22) at line 24: Unknown column 'cid' in 'field list'
How do I fix the error? Another thing I am worried about is that as a result, I will select not only the parent’s id but also the id of one of his/her children. Is it possible to avoid it? Probably there is a better way to select the data I’m looking for?
Advertisement
Answer
You can get the ages by using joins:
select c.*, (p.birth_year - c.birth_year) as parent_age from parents pa join people p on pa.parent_id = p.id join people c on pa.child_id = pc.id;
To get all the rows with the minimum, use window functions:
select x.* from (select c.*, (p.birth_year - c.birth_year) as parent_age, min(p.birth_year - c.birth_year) over () as min_age from parents pa join people p on pa.parent_id = p.id join people c on pa.child_id = pc.id ) x where parent_age = min_age;