I am using PostgreSQL and am having difficulty with getting a series of queries that combine the data from two tables (t1, t2)
t1 is
studyida | gender | age |
---|---|---|
a | M | 1 |
a | M | 2 |
a | M | 3 |
b | F | 4 |
b | F | 5 |
b | F | 6 |
c | M | 13 |
c | M | 14 |
c | M | 15 |
and t2 is
studyida | studyidb | gender | age |
---|---|---|---|
a | z | M | 3 |
a | z | M | 4 |
a | z | M | 5 |
NULL | y | F | 7 |
NULL | y | F | 8 |
NULL | y | F | 9 |
c | x | M | 10 |
c | x | M | 11 |
c | x | M | 12 |
NULL | w | F | 7 |
NULL | w | F | 8 |
NULL | w | F | 9 |
NULL | u | M | 7 |
NULL | u | M | 8 |
NULL | u | M | 9 |
t1 and t2 are related via StudyIDA and gender. What I need is a comprehensive listing from both tables, including including the ages. Sometimes the age in t1 equals the age in t2 (e.g. for StudyIDA=a, age=3) but most of the time it does not.
I am looking to create a table like this
StudyIDA | StudyIDB | gender | ageA | ageB |
---|---|---|---|---|
a | z | M | 1 | |
a | z | M | 2 | |
a | z | M | 3 | 3 |
a | z | M | 4 | |
a | z | M | 5 | |
b | NULL | F | 4 | |
b | NULL | F | 5 | |
b | NULL | F | 6 | |
NULL | y | F | 7 | |
NULL | y | F | 8 | |
NULL | y | F | 9 | |
c | x | F | 13 | |
c | x | F | 14 | |
c | x | F | 15 | |
c | x | F | 10 | |
c | x | F | 11 | |
c | x | F | 12 | |
NULL | w | F | 7 | |
NULL | w | F | 8 | |
NULL | w | F | 9 | |
NULL | u | M | 7 | |
NULL | u | M | 8 | |
NULL | u | M | 9 |
I was thinking that first a full outer join of t1 and t2 would give me what I want but it does not.
Then I was thinking I need a listing of all the individuals (lets call it t3), and then do a series of inserts (e.g. t1+t3 and also t1+t3) into a new table to ‘construct’ what I need. I am really stuck on the odd times when age in t1 equals the age in t2 (e.g. for StudyIDA=a, age=3).
I am still not getting what I need. Here is my code so far
DROP TABLE IF EXISTS t1, t2, t3;
CREATE TEMPORARY TABLE t1 (StudyIDA VARCHAR, gender VARCHAR, age int);
INSERT INTO t1 VALUES
('a','M', 1),('a','M', 2),('a','M', 3),
('b','F', 4),('b','F', 5),('b','F', 6),
('c','M', 13),('c','M', 14),('c','M', 15);
SELECT * FROM t1;
CREATE TEMPORARY TABLE t2 (StudyIDA VARCHAR, StudyIDB varchar, gender VARCHAR, age int);
INSERT INTO t2 VALUES
('a','z','M', 3), ('a','z','M', 4), ('a','z','M', 5),
(NULL,'y','F', 7),(NULL,'y','F', 8),(NULL,'y','F', 9),
('c','x','M', 10),('c','x','M', 11),('c','x','M', 12),
(NULL,'w','F', 7),(NULL,'w','F', 8),(NULL,'w','F', 9),
(NULL,'u','M', 7),(NULL,'u','M', 8),(NULL,'u','M', 9);
SELECT * FROM t2;
CREATE TEMPORARY TABLE t3 (StudyIDA_t1 VARCHAR, gender_t1 VARCHAR, StudyIDA_t2 VARCHAR,StudyIDB varchar,
gender_t2 VARCHAR);
INSERT INTO t3
SELECT * FROM (SELECT DISTINCT StudyIDA, gender FROM t1) a FULL OUTER JOIN
(SELECT DISTINCT StudyIDA, StudyIDB, gender FROM t2) b
ON a.StudyIDA=b.StudyIDA AND a.gender=b.gender
ORDER BY a.StudyIDA;
SELECT * FROM t3 ORDER BY StudyIDA_t1;
SELECT 'IN t1', *
FROM t3 JOIN t1 on t1.StudyIDA=t3.StudyIDA_t1 AND t1.gender=t3.gender_t1
ORDER BY StudyIDA_t1, StudyIDB;
SELECT 'In t2',*
FROM t3 JOIN t2 on t3.StudyIDA_t1=t2.StudyIDA AND t3.gender_t1=t2.gender
ORDER BY StudyIDA_t1, t3.StudyIDB;
DROP TABLE IF EXISTS t1, t2, t3;
Advertisement
Answer
A full join that includes the age maybe?
And some coalesce’s for common fields.
SELECT DISTINCT
COALESCE(t1.StudyIDA, t2.StudyIDA) AS StudyIDA
, t2.StudyIDB
, COALESCE(t1.gender, t2.gender) AS gender
, t1.age as ageA
, t2.age as ageB
FROM t1
FULL JOIN t2
ON t2.StudyIDA is not distinct from t1.StudyIDA
AND t2.gender = t1.gender
AND t2.age = t1.age
ORDER BY StudyIDA, gender, ageA, ageB;
studyida | studyidb | gender | agea | ageb :------- | :------- | :----- | ---: | ---: a | null | M | 1 | null a | null | M | 2 | null a | z | M | 3 | 3 a | z | M | null | 4 a | z | M | null | 5 b | null | F | 4 | null b | null | F | 5 | null b | null | F | 6 | null c | null | M | 13 | null c | null | M | 14 | null c | null | M | 15 | null c | x | M | null | 10 c | x | M | null | 11 c | x | M | null | 12 null | w | F | null | 7 null | y | F | null | 7 null | w | F | null | 8 null | y | F | null | 8 null | w | F | null | 9 null | y | F | null | 9 null | u | M | null | 7 null | u | M | null | 8 null | u | M | null | 9
db<>fiddle here