I have a table trx with following schema:
| id | p_id | | 1 | 1 | | 2 | 1 | | 3 | 4 | ... | 1000 | 2 |
where id is the transaction id and p_id the id of the person doing it.
I need to query trx so I get a table that allows me to plot an histogram of frequencies of transactions, that means, I want to know how many p_id did only 1 transaction, how many did 2, and so on. I have to aggregate 11 and more transactions in a +10 bin:
| n_trx | bin_size | | 1 | 10 | | 2 | 18 | | 3 | 7 | ... | +10 | 26 |
I know that I need the CASE statement for the +10 thing, and I approached the task using two CTE:
WITH new_trx_history
AS (WITH trx_history
AS (SELECT p_id,
Count(DISTINCT id) AS n_trx
FROM trx
GROUP BY p_id)
SELECT CASE
WHEN n_trx < 11 THEN n_trx
ELSE '+10'
END AS n_trx,
Count(*) AS bin_size
FROM trx_history
GROUP BY n_trx)
SELECT n_trx,
Sum(bin_size)
FROM new_trx_history
GROUP BY n_trx;
I would like to know if there is a way to have a more direct approach than my actual (working) query.
Here is the SQLFiddle
Advertisement
Answer
You can use CASE in GROUP BY
WITH trx_history
AS (SELECT p_id,
Count(DISTINCT id) AS n_trx
FROM trx
GROUP BY p_id)
SELECT CASE
WHEN n_trx < 11 THEN n_trx
ELSE '+10'
END AS n_trx,
Count(*) AS bin_size
FROM trx_history
GROUP BY CASE
WHEN n_trx < 11 THEN n_trx
ELSE '+10'
END
;