I have the following query and a sample output uid creation_date activity_date feature1 feature2 feature3 id1 date1 date1x asdf sfsdfd sdsdf id1 date1 date1x asdf fasaa asdas id1 date1 date1x asdf sfsdfd asdas id1 date1 date1x aadf afdsf adfad id2 date2 date2x aadf afdsd asdas id2 date2 date2x aadf adsfsdf sdsdf id2 date2 date2x asdas afdsf adfad id3 date3 date3x

Count records based on a repeating field while returning other values

I have the following query

SELECT
  DISTINCT a.uid AS uid,
  a.creation_date as creation_date,
  a.activity_date as activity_date,
  feature1 as feature1,
  feature2 as feature2,
  feature3 as feature3,   
FROM ( table a
  INNER JOIN
   table b
  ON
    a.uid=b.uid
  INNER JOIN
    table c
  ON
    c.uid=a.uid
   )
Where *certain conditions*
GROUP BY 1,2,3,4,5,6

​x
 
SELECT  DISTINCT a.uid AS uid,  a.creation_date as creation_date,  a.activity_date as activity_date,  feature1 as feature1,  feature2 as feature2,  feature3 as feature3,   FROM ( table a  INNER JOIN   table b  ON    a.uid=b.uid  INNER JOIN    table c  ON    c.uid=a.uid   )Where *certain conditions*GROUP BY 1,2,3,4,5,6​

and a sample output

uid	creation_date	activity_date	feature1	feature2	feature3
id1	date1	date1x	asdf	sfsdfd	sdsdf
id1	date1	date1x	asdf	fasaa	asdas
id1	date1	date1x	asdf	sfsdfd	asdas
id1	date1	date1x	aadf	afdsf	adfad
id2	date2	date2x	aadf	afdsd	asdas
id2	date2	date2x	aadf	adsfsdf	sdsdf
id2	date2	date2x	asdas	afdsf	adfad
id3	date3	date3x	aadf	sfsdfd	sdsdf
id3	date3	date3x	aadf	afdsf	sdsdf

What I would also like to have is counting the number of occurrences of the uid in the output table together with the other retrieved variables.

uid	uid_count	creation_date	activity_date	feature1	feature2	feature3
id1	4	date1	date1x	asdf	sfsdfd	sdsdf
id1	4	date1	date1x	asdf	fasaa	asdas
id1	4	date1	date1x	asdf	sfsdfd	asdas
id1	4	date1	date1x	aadf	afdsf	adfad
id2	3	date2	date2x	aadf	afdsd	asdas
id2	3	date2	date2x	aadf	adsfsdf	sdsdf
id2	3	date2	date2x	asdas	afdsf	adfad
id3	2	date3	date3x	aadf	sfsdfd	sdsdf
id3	2	date3	date3x	aadf	afdsf	sdsdf

If I just add a line to the SELECT statement above with count (uid) it’s obviously needed to specify from which table, i.e. a, b, or c, the uid is coming from. Yet I want the count of the uid’s from the output table that was composed by joining the three table.

Thanks a lot for all the tips on how to achieve that.

Answer

Since an analytic(window) function is evaluated after JOIN and GROUP BY clause, you can simply add COUNT(*) OVER (PARTITION BY uid) AS uid_count in the middle of SELECT list.

SELECT
  DISTINCT a.uid AS uid,
  COUNT(*) OVER (PARTITION BY a.uid) AS uid_count,
  a.creation_date as creation_date,
  a.activity_date as activity_date,
  feature1 as feature1,
  feature1 as feature1,
  feature1 as feature1,   
FROM ( table a
  INNER JOIN
   table b
  ON
    a.uid=b.uid
  INNER JOIN
    table c
  ON
    c.uid=a.uid
   )
Where *certain conditions*
GROUP BY 1,3,4,5,6,7

 
SELECT  DISTINCT a.uid AS uid,  COUNT(*) OVER (PARTITION BY a.uid) AS uid_count,  a.creation_date as creation_date,  a.activity_date as activity_date,  feature1 as feature1,  feature1 as feature1,  feature1 as feature1,   FROM ( table a  INNER JOIN   table b  ON    a.uid=b.uid  INNER JOIN    table c  ON    c.uid=a.uid   )Where *certain conditions*GROUP BY 1,3,4,5,6,7​

Note that GROUP BY 1,3,4,5,6,7 became different from the orginal post. (I’m not sure this group by is really necessary.)

Advertisement

Answer