I have two tables that I’d like do a full outer join
where the resulting view separates the values table into two separate columns with one row for each name_id
. I have made one approach with a CASE
expression to select by type and then use it with pandas to fill in the values and return distinct name_ids.
Name Table
name_id | name |
---|---|
1 | foo |
2 | bar |
3 | doo |
4 | sue |
Values Table
name_id | value | type |
---|---|---|
1 | 90 | red |
2 | 95 | blue |
3 | 33 | red |
3 | 35 | blue |
4 | 60 | blue |
4 | 20 | red |
This is a condensed version. In my full table, I need to do this twice with two separate value tables sorted by type, red/blue and control/placebo.
Simple Join
SELECT names_table.name_id, name, value, type FULL OUTER JOIN values_table ON names_table.name_id = values_table.name_id WHERE type in ('red', 'blue')
name_id | name | value | type |
---|---|---|---|
1 | foo | 90 | red |
2 | bar | 95 | blue |
3 | doo | 33 | red |
3 | doo | 35 | blue |
4 | sue | 60 | blue |
4 | sue | 20 | red |
Current work around result which I then fix with python and pandas
SELECT names_table.name_id, name, value, type CASE WHEN type = 'red' THEN value END red, CASE WHEN type = 'blue' THEN value END blue FROM names_table FULL OUTER JOIN values_table ON names_table.name_id = values_table.name_id
name_id | name | blue | red |
---|---|---|---|
1 | foo | Null | 90 |
2 | bar | 95 | Null |
3 | doo | 35 | Null |
3 | doo | Null | 33 |
4 | sue | 60 | Null |
4 | sue | Null | 20 |
This is my desired output below, where I would have the types as columns and just rows for unique name_ids but with value tables 1 and 2.
Desired Output
name_id | name | blue | red |
---|---|---|---|
1 | foo | Null | 90 |
2 | bar | 95 | Null |
3 | doo | 35 | 33 |
4 | sue | 60 | 20 |
Advertisement
Answer
I have two tables that I’d like do a
full outer join
…
Why would you? Better explain what you actually want to do instead of the assumed tool to implement it.
Simple pivoting with the aggregate FILTER
clause. See:
SELECT name_id, n.name, v.blue, v.red FROM ( SELECT name_id , min(value) FILTER (WHERE type = 'blue') AS blue , min(value) FILTER (WHERE type = 'red') AS red FROM values_table GROUP BY 1 ) v LEFT JOIN names_table n USING (name_id);
Produces your desired result.
db<>fiddle here
The LEFT JOIN
includes result rows even if no name is found.
A FULL [OUTER] JOIN
would add names in the result that have no values at all. I think you really want a LEFT [OUTER] JOIN
or even a plain [INNER] JOIN
.
You can just switch the JOIN
type to adapt to your actual requirements. The identical column name “name_id” allows to join with a USING
clause. The unqualified name_id
in the outer SELECT
works for any join type.
Note how I aggregate first and join later. Typically substantially faster. See:
If there can be duplicate values for “red” or “blue”, you’ll have to define how to deal with those.
For more involved queries consider crosstab()
. See: