Skip to content
Advertisement

Filtering out duplicates yet showing non-duplicate rows from another column in one row

I have a table with multiple duplicate values. What I’d like to do is use some sort of SELECT query that will filter the duplicates by name and yet keep all the appropriate values from the “expertise” column and show them next to each other in one row separated by a coma.

In other words I want to turn a table that looks like this:

 names   | organization | education_years | expertise 
---------+--------------+-----------------+------------------------------------------------------------------------------------------------------------------------------------------
 John    | University   | 2016 – 2020     | Programming
 John    | University   | 2016 – 2020     | machine Learning
 John    | University   | 2016 – 2020     | AI
 John    | University   | 2016 – 2020     | Robotics
 John    | University   | 2016 – 2020     | Symbolic Reasoning
 Gabriel | Company      | 2015 – Present  | models/networks
 Gabriel | Company      | 2015 – Present  | AI
 Gabriel | Company      | 2015 – Present  | Speech/language
 Gabriel | Company      | 2015 – Present  | machine learning
 Gabriel | Company      | 2015 – Present  | Programming
 Nicolas | Company      | Present         | machine learning, factorization, learning theory, supervised learning

Into This:

 names   | organization | education_years | expertise 
---------+--------------+-----------------+------------------------------------------------------------------------------------------------------------------------------------------
 John    | University   | 2016 – 2020     | Programming, machine Learning, AI, Robotics, Symbolic Reasoning
 Gabriel | Company      | 2015 – Present  | models/networks, AI, Speech/language, machine learning, Programming
 Nicolas | Company      | Present         | machine learning, factorization, learning theory, supervised learning

I’m using PostgreSQL as my database.

Can anyone help? Haven’t done queries for so long, and couldn’t find a similar question that addresses this.

Advertisement

Answer

Use aggregation:

select name, organization, education_years,
       string_agg(expertise, ', ')
from t
group by name, organization, education_years;

The above uses string_agg() because the result looks like a string. I would advise you to use arrays instead:

select name, organization, education_years,
       array_agg(expertise)
from t
group by name, organization, education_years;
User contributions licensed under: CC BY-SA
9 People found this is helpful
Advertisement