I have the following query
SELECT DISTINCT a.uid AS uid, a.creation_date as creation_date, a.activity_date as activity_date, feature1 as feature1, feature2 as feature2, feature3 as feature3, FROM ( table a INNER JOIN table b ON a.uid=b.uid INNER JOIN table c ON c.uid=a.uid ) Where *certain conditions* GROUP BY 1,2,3,4,5,6
and a sample output
uid | creation_date | activity_date | feature1 | feature2 | feature3 |
---|---|---|---|---|---|
id1 | date1 | date1x | asdf | sfsdfd | sdsdf |
id1 | date1 | date1x | asdf | fasaa | asdas |
id1 | date1 | date1x | asdf | sfsdfd | asdas |
id1 | date1 | date1x | aadf | afdsf | adfad |
id2 | date2 | date2x | aadf | afdsd | asdas |
id2 | date2 | date2x | aadf | adsfsdf | sdsdf |
id2 | date2 | date2x | asdas | afdsf | adfad |
id3 | date3 | date3x | aadf | sfsdfd | sdsdf |
id3 | date3 | date3x | aadf | afdsf | sdsdf |
What I would also like to have is counting the number of occurrences of the uid in the output table together with the other retrieved variables.
uid | uid_count | creation_date | activity_date | feature1 | feature2 | feature3 |
---|---|---|---|---|---|---|
id1 | 4 | date1 | date1x | asdf | sfsdfd | sdsdf |
id1 | 4 | date1 | date1x | asdf | fasaa | asdas |
id1 | 4 | date1 | date1x | asdf | sfsdfd | asdas |
id1 | 4 | date1 | date1x | aadf | afdsf | adfad |
id2 | 3 | date2 | date2x | aadf | afdsd | asdas |
id2 | 3 | date2 | date2x | aadf | adsfsdf | sdsdf |
id2 | 3 | date2 | date2x | asdas | afdsf | adfad |
id3 | 2 | date3 | date3x | aadf | sfsdfd | sdsdf |
id3 | 2 | date3 | date3x | aadf | afdsf | sdsdf |
If I just add a line to the SELECT statement above with count (uid)
it’s obviously needed to specify from which table, i.e. a, b, or c, the uid is coming from. Yet I want the count of the uid’s from the output table that was composed by joining the three table.
Thanks a lot for all the tips on how to achieve that.
Advertisement
Answer
Since an analytic(window) function is evaluated after JOIN and GROUP BY clause, you can simply add COUNT(*) OVER (PARTITION BY uid) AS uid_count in the middle of SELECT list.
SELECT DISTINCT a.uid AS uid, COUNT(*) OVER (PARTITION BY a.uid) AS uid_count, a.creation_date as creation_date, a.activity_date as activity_date, feature1 as feature1, feature1 as feature1, feature1 as feature1, FROM ( table a INNER JOIN table b ON a.uid=b.uid INNER JOIN table c ON c.uid=a.uid ) Where *certain conditions* GROUP BY 1,3,4,5,6,7
Note that GROUP BY 1,3,4,5,6,7 became different from the orginal post. (I’m not sure this group by is really necessary.)