Skip to content
Advertisement

Is it possible to group sql join results by a common column? Please see the description

I’m super rusty with databases and haven’t got the chance to work too much with them in the past 6-8 years, so there are some things that are uncertain for me.

Let’s assume I have a simple bi-directional one-to-many situation:

Parent
------
id

and

Child
-----
id
age
parent_id

and I want to perform a query which will return all the parents together with the matching children that follow some rules.

For example let’s assume we have

Parent: [Child (age 5), Child (age 8), Child (age 10)]
Parent: [Child (age 8), Child (age 15)]
Parent: [Child (age 20)]

and I want to obtain the parents together with their children filtered for example by age (less than 10).

This situations screams a join with a where statement – something like:

SELECT Parent.id, Child.id, Child.age
    FROM Parent LEFT JOIN Child ON Parent.id = Child.parent_id 
    WHERE Child.age < 10;

And this will work fine and return 3 entries – two rows for parent1 with the children that match the rule and one for parent two and its children.

However, my question is – can I group it somehow based on Parent Id? Can I use LIMIT for limiting the number of Parents? (For example in this case limit 2 will return all results – because there are two parents that match the criteria)

Background context – I’m trying to implement this with Java, Hibernate and a Postgres db. I’m using OneToMany/ManyToOne relationships and CriteriaQuery in order to interogate the DB.

Advertisement

Answer

If I understand you, you want to limit two parents, but this could mean more than two rows because a parent can have more than one child under the age of 10? If so, the count analytic function can start numbering parents, and you can set the limit by taking the resulting count <= 2:

with qualifying_parents as (
  select
    c.id, c.age, c.parent_id,
    count (parent_id) over (partition by parent_id) as cnt
  from
    child c
    join parent p on c.parent_id = p.id
  where
    c.age < 10
)
select id as child_id, age, parent_id
from qualifying_parents
where cnt <= 2

This would give you three rows from your example but only two distinct parent ids.

Also, in this example, the parent table is unnecessary, but I assume it has more attributes than just ID that you intend to use in your query.

Alternatively, you can leverage Postgres` most excellent array functionality and then invoke a simple limit clause at the end of your query:

select
  parent_id, array_agg(age) as children_ages
from child c
where age < 10
group by parent_id
limit 2

(I omitted parent there, but you can put it back obviously).

User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement