I have a table with multiple duplicate values. What I’d like to do is use some sort of SELECT query that will filter the duplicates by name and yet keep all the appropriate values from the “expertise” column and show them next to each other in one row separated by a coma.
In other words I want to turn a table that looks like this:
names | organization | education_years | expertise ---------+--------------+-----------------+------------------------------------------------------------------------------------------------------------------------------------------ John | University | 2016 – 2020 | Programming John | University | 2016 – 2020 | machine Learning John | University | 2016 – 2020 | AI John | University | 2016 – 2020 | Robotics John | University | 2016 – 2020 | Symbolic Reasoning Gabriel | Company | 2015 – Present | models/networks Gabriel | Company | 2015 – Present | AI Gabriel | Company | 2015 – Present | Speech/language Gabriel | Company | 2015 – Present | machine learning Gabriel | Company | 2015 – Present | Programming Nicolas | Company | Present | machine learning, factorization, learning theory, supervised learning
Into This:
names | organization | education_years | expertise ---------+--------------+-----------------+------------------------------------------------------------------------------------------------------------------------------------------ John | University | 2016 – 2020 | Programming, machine Learning, AI, Robotics, Symbolic Reasoning Gabriel | Company | 2015 – Present | models/networks, AI, Speech/language, machine learning, Programming Nicolas | Company | Present | machine learning, factorization, learning theory, supervised learning
I’m using PostgreSQL as my database.
Can anyone help? Haven’t done queries for so long, and couldn’t find a similar question that addresses this.
Advertisement
Answer
Use aggregation:
select name, organization, education_years, string_agg(expertise, ', ') from t group by name, organization, education_years;
The above uses string_agg()
because the result looks like a string. I would advise you to use arrays instead:
select name, organization, education_years, array_agg(expertise) from t group by name, organization, education_years;