I have 2 tables: table1 (including id, title, and abstract) and table2 (including id, pubdate, and family_id).
Each id is unique but multiple rows can have the same family_id.
My goal is to display the id, title and pubdate only for the MIN(pubdate) of each family_id so that I should get:
I tried this:
SELECT t1.id, t1.title, nt2.pubdate FROM (SELECT id, family_id, MIN(pubdate) AS pubdate FROM table2 GROUP BY family_id) AS nt2 INNER JOIN table1 t1 ON t1.id = nt2.id
I get the following error message in BigQuery: “SELECT list expression references column id which is neither grouped nor aggregated at [position]“.
I read here that this is due to the fact that all expressions in the respective SELECT list must be either those which are in the GROUP BY or should be with AGGREGATION function.
But, I do need id for merging with table1.
How can I do this?
Advertisement
Answer
In the “Group by” statement, you need to write all the fields you use in the “SELECT” statement. You can see this documentation.
Like this line of code:
GROUP BY Family_id,id
In this case, if the family and id are duplicated in another field, they will group.
Here is an example code:
SELECT t1.id, t1.title, nt2.pubdate FROM ( Select B.* from ( SELECT family_id, MIN(pubdate) AS pubdate FROM table2 GROUP BY family_id ) as A inner join table2 B on A.pubdate = B.pubdate and A.family_id=B.family_id ) AS nt2 INNER JOIN table1 t1 ON t1.id = nt2.id