Skip to content
Advertisement

How to separate column values by condition (pivot) to fill one row

I have two tables that I’d like do a full outer join where the resulting view separates the values table into two separate columns with one row for each name_id. I have made one approach with a CASE expression to select by type and then use it with pandas to fill in the values and return distinct name_ids.

Name Table

name_id name
1 foo
2 bar
3 doo
4 sue

Values Table

name_id value type
1 90 red
2 95 blue
3 33 red
3 35 blue
4 60 blue
4 20 red

This is a condensed version. In my full table, I need to do this twice with two separate value tables sorted by type, red/blue and control/placebo.

Simple Join

SELECT names_table.name_id, name, value, type
FULL OUTER JOIN values_table
ON names_table.name_id = values_table.name_id
WHERE type in ('red', 'blue')
name_id name value type
1 foo 90 red
2 bar 95 blue
3 doo 33 red
3 doo 35 blue
4 sue 60 blue
4 sue 20 red

Current work around result which I then fix with python and pandas

SELECT names_table.name_id, name, value, type
CASE 
    WHEN type = 'red' THEN value END red,
CASE 
    WHEN type = 'blue' THEN value END blue
FROM names_table
FULL OUTER JOIN values_table
ON names_table.name_id = values_table.name_id
name_id name blue red
1 foo Null 90
2 bar 95 Null
3 doo 35 Null
3 doo Null 33
4 sue 60 Null
4 sue Null 20

This is my desired output below, where I would have the types as columns and just rows for unique name_ids but with value tables 1 and 2.

Desired Output

name_id name blue red
1 foo Null 90
2 bar 95 Null
3 doo 35 33
4 sue 60 20

Advertisement

Answer

I have two tables that I’d like do a full outer join

Why would you? Better explain what you actually want to do instead of the assumed tool to implement it.

Simple pivoting with the aggregate FILTER clause. See:

SELECT name_id, n.name, v.blue, v.red
FROM  (
   SELECT name_id
        , min(value) FILTER (WHERE type = 'blue') AS blue
        , min(value) FILTER (WHERE type = 'red')  AS red
   FROM   values_table
   GROUP  BY 1
   ) v
LEFT   JOIN names_table n USING (name_id);

Produces your desired result.

db<>fiddle here

The LEFT JOIN includes result rows even if no name is found. A FULL [OUTER] JOIN would add names in the result that have no values at all. I think you really want a LEFT [OUTER] JOIN or even a plain [INNER] JOIN.

You can just switch the JOIN type to adapt to your actual requirements. The identical column name “name_id” allows to join with a USING clause. The unqualified name_id in the outer SELECT works for any join type.

Note how I aggregate first and join later. Typically substantially faster. See:

If there can be duplicate values for “red” or “blue”, you’ll have to define how to deal with those.

For more involved queries consider crosstab(). See:

User contributions licensed under: CC BY-SA
8 People found this is helpful
Advertisement