Skip to content
Advertisement

Postgres – Unique values for id column using CTE, Joins alongside GROUP BY

I have a table referrals:

id | user_id_owner | firstname | is_active | user_type | referred_at
----+---------------+-----------+-----------+-----------+-------------
  3 |             2 | c         | t         | agent     |           3
  5 |             3 | e         | f         | customer  |           5
  4 |             1 | d         | t         | agent     |           4
  2 |             1 | b         | f         | agent     |           2
  1 |             1 | a         | t         | agent     |           1

And another table activations

    id | user_id_owner | referral_id | amount_earned | activated_at | app_id
----+---------------+-------------+---------------+--------------+--------
  2 |             2 |           3 |           3.0 |            3 | a
  4 |             1 |           1 |           6.0 |            5 | b
  5 |             4 |           4 |           3.0 |            6 | c
  1 |             1 |           2 |           2.0 |            2 | b
  3 |             1 |           2 |           5.0 |            4 | b
  6 |             1 |           2 |           7.0 |            8 | a

I am trying to generate another table from the two tables that has only unique values for referrals.id and returns as one of the columns the count for each apps as best_selling_app_count.

Here is the query I ran:

with agents 
    as 
    (select 
    referrals.id, 
    referral_id, 
    amount_earned, 
    referred_at, 
    activated_at, 
    activations.app_id 
    from referrals 
    left outer join activations 
    on (referrals.id = activations.referral_id) 
    where referrals.user_id_owner = 1), 
    distinct_referrals_by_id 
    as 
    (select 
    id, 
    count(referral_id) as activations_count, 
    sum(coalesce(amount_earned, 0)) as amount_earned, 
    referred_at, 
    max(activated_at) as last_activated_at 
    from 
    agents 
    group by id, referred_at), 
    distinct_referrals_by_app_id 
    as 
    (select id, app_id as best_selling_app,
    count(app_id) as best_selling_app_count 
    from agents 
    group by id, app_id ) 
    select *, dense_rank() over (order by best_selling_app_count desc) best_selling_app_rank 
    from distinct_referrals_by_id 
    inner join distinct_referrals_by_app_id 
    on (distinct_referrals_by_id.id = distinct_referrals_by_app_id.id);

Here is the result I got:

id | activations_count | amount_earned | referred_at | last_activated_at | id | best_selling_app | best_selling_app_count | best_selling_app_rank
----+-------------------+---------------+-------------+-------------------+----+------------------+------------------------+-----------------------
  2 |                 3 |          14.0 |           2 |                 8 |  2 | b                |                      2 |                     1
  1 |                 1 |           6.0 |           1 |                 5 |  1 | b                |                      1 |                     2
  2 |                 3 |          14.0 |           2 |                 8 |  2 | a                |                      1 |                     2
  4 |                 1 |           3.0 |           4 |                 6 |  4 | c                |                      1 |                     2

The problem with this result is that the table has a duplicate id of 2. I only need unique values for the id column.

I tried a workaround by harnessing distinct that gave desired result but I fear the query results may not be reliable and consistent. Here is the workaround query:

with agents 
    as 
    (select 
    referrals.id, 
    referral_id, 
    amount_earned, 
    referred_at, 
    activated_at, 
    activations.app_id 
    from referrals 
    left outer join activations 
    on (referrals.id = activations.referral_id) 
    where referrals.user_id_owner = 1), 
    distinct_referrals_by_id 
    as 
    (select 
    id, 
    count(referral_id) as activations_count, 
    sum(coalesce(amount_earned, 0)) as amount_earned, 
    referred_at, 
    max(activated_at) as last_activated_at 
    from 
    agents 
    group by id, referred_at), 
    distinct_referrals_by_app_id 
    as 
    (select 
    distinct on(id), app_id as best_selling_app,
    count(app_id) as best_selling_app_count 
    from agents 
    group by id, app_id 
    order by id, best_selling_app_count desc) 
    select *, dense_rank() over (order by best_selling_app_count desc) best_selling_app_rank 
    from distinct_referrals_by_id 
    inner join distinct_referrals_by_app_id 
    on (distinct_referrals_by_id.id = distinct_referrals_by_app_id.id);

I need a recommendation on how best to achieve this.

Advertisement

Answer

I am trying to generate another table from the two tables that has only unique values for referrals.id and returns as one of the columns the count for each apps as best_selling_app_count.

Your question is really complicated with a very complicated SQL query. However, the above is what looks like the actual question. If so, you can use:

select r.*, 
       a.app_id as most_common_app_id,
       a.cnt as most_common_app_id_count
from referrals r left join
     (select distinct on (a.referral_id) a.referral_id, a.app_id, count(*) as cnt
      from activations a
      group by a.referral_id, a.app_id
      order by a.referral_id, count(*) desc
     ) a
     on a.referral_id = r.id;

You have not explained the other columns that are in your result set.

User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement