
SQL multi query

I need some help getting this right in one query (if that's possible). This is a theoretical example, and I assume the presence of events in event_name (like registration/action, etc.).

I have 3 columns:

-user_id
-event_timestamp
-event_name

From these 3 columns we need to create a new table with 4 new columns:

-user registration year and month
-number of new user registrations in that month
-number of users who returned in the second calendar month, i.e. the calendar month following registration
-return probability

The result must look like this:

2019-1 | 1 | 1 | 100%
2019-2 | 3 | 2 | 67%
2019-3 | 2 | 0 | 0%
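
If this result were stored as its own table, I imagine something like this (the column names are just my guesses):

CREATE TABLE `monthly_retention` (
  `reg_month` char(7),
  `new_users` int(10),
  `returned_users` int(10),
  `return_probability` decimal(5,2)
) DEFAULT CHARSET=utf8;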

What I’ve done so far: I’m using this toy example of my possible main table:

CREATE TABLE `main` (
  `event_timestamp` timestamp,
  `user_id` int(10),
  `event_name` char(12)
) DEFAULT CHARSET=utf8;
INSERT INTO `main` (`event_timestamp`, `user_id`, `event_name`) VALUES
  ('2019-01-23 20:02:21.550', '1', 'registration'),
  ('2019-01-24 20:03:21.550', '2', 'action'),
  ('2019-02-21 20:04:21.550', '3', 'registration'),
  ('2019-02-22 20:05:21.550', '4', 'registration'),
  ('2019-02-23 20:06:21.550', '5', 'registration'),
  ('2019-02-23 20:06:21.550', '1', 'action'),
  ('2019-02-24 20:07:21.550', '6', 'action'),
  ('2019-03-20 20:08:21.550', '3', 'action'),
  ('2019-03-21 20:09:21.550', '4', 'action'),
  ('2019-03-22 20:10:21.550', '9', 'action'),
  ('2019-03-23 20:11:21.550', '10', 'registration'),
  ('2019-03-22 20:10:21.550', '4', 'action'),
  ('2019-03-22 20:10:21.550', '5', 'action'),
  ('2019-03-24 20:11:21.550', '11', 'registration');

I’m testing some queries to create the 4 new columns:

This is for column #1: we select the year and month from the timestamp where the event is 'registration' (as I understand it), but I still need to aggregate it per month (like 2019-11, 2019-12):

SELECT DATE_FORMAT(event_timestamp, '%Y-%m') AS column_1 FROM main
WHERE event_name='registration';
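
I guess this needs a GROUP BY to collapse it to one row per month; something like this is probably closer, though I'm not sure it's the right approach:

SELECT DATE_FORMAT(event_timestamp, '%Y-%m') AS column_1
FROM main
WHERE event_name='registration'
GROUP BY DATE_FORMAT(event_timestamp, '%Y-%m');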

For column #2 we need to count the users with event_name 'registration' in each month, or... we could try searching for each user's first activity by user_id, but I don't know how to do that.

Here are some thoughts about it…

SELECT COUNT(DISTINCT user_id) AS user_count
FROM main
GROUP BY MONTH(event_timestamp);
SELECT COUNT(DISTINCT user_id) AS user_count FROM main
WHERE event_name='registration';
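
For the first-activity idea, maybe something like this, grouping each user's earliest event into a month (just a sketch, I'm not sure it matches the registration-based count):

SELECT first_month, COUNT(*) AS user_count
FROM (
  SELECT user_id, DATE_FORMAT(MIN(event_timestamp), '%Y-%m') AS first_month
  FROM main
  GROUP BY user_id
) AS first_events
GROUP BY first_month;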

For column #3 we need to match each user_id's registration event against any event of theirs in the following calendar month, so we get the users who returned the next month.

Any idea how to create this query?
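
Maybe a self-join on user_id with a one-month offset could work, something like the sketch below, but I'm not sure it's correct:

SELECT DATE_FORMAT(r.event_timestamp, '%Y-%m') AS reg_month,
       COUNT(DISTINCT a.user_id) AS returned_users
FROM main AS r
LEFT JOIN main AS a
  ON a.user_id = r.user_id
  AND a.event_name = 'action'
  AND DATE_FORMAT(a.event_timestamp, '%Y-%m') = DATE_FORMAT(r.event_timestamp + INTERVAL 1 MONTH, '%Y-%m')
WHERE r.event_name = 'registration'
GROUP BY DATE_FORMAT(r.event_timestamp, '%Y-%m');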

This is how I would calculate column #4 from columns #2 and #3 (main itself doesn't have those columns yet):

SELECT *,
  ROUND((column_3 / column_2) * 100) AS column_4
FROM main;
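
As a sanity check of the formula against the expected 2019-2 row (3 new users, 2 returned), applying it to a derived row instead of main gives 67:

SELECT t.column_1, t.column_2, t.column_3,
       ROUND((t.column_3 / t.column_2) * 100) AS column_4
FROM (
  SELECT '2019-2' AS column_1, 3 AS column_2, 2 AS column_3
) AS t;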


Answer

I hope you will find the following answer helpful.

The first column is the extraction of the year and month. The new_users column is the COUNT of the unique user ids where the event is ‘registration’; the DISTINCT is needed because a user can be duplicated by the JOIN as a result of taking multiple actions the following month. The returned_users column is the number of users who have an action in the month following their registration; it also needs DISTINCT since a user can have multiple actions during one month. The final column is the probability you asked for, computed from the two previous columns.

The JOIN clause is a self-join that brings in the users who had at least one action in the month following their registration.

SELECT CONCAT(YEAR(A.event_timestamp), '-', MONTH(A.event_timestamp)),
  COUNT(DISTINCT CASE WHEN A.event_name LIKE 'registration' THEN A.user_id END) AS new_users,
  COUNT(DISTINCT B.user_id) AS returned_users,
  CASE
    WHEN COUNT(DISTINCT CASE WHEN A.event_name LIKE 'registration' THEN A.user_id END) = 0 THEN 0
    ELSE COUNT(DISTINCT B.user_id)
         / COUNT(DISTINCT CASE WHEN A.event_name LIKE 'registration' THEN A.user_id END) * 100
  END AS My_Ratio
FROM main AS A
LEFT JOIN main AS B
  ON A.user_id = B.user_id
  AND MONTH(A.event_timestamp) + 1 = MONTH(B.event_timestamp)
  AND A.event_name = 'registration'
  AND B.event_name = 'action'
GROUP BY CONCAT(YEAR(A.event_timestamp), '-', MONTH(A.event_timestamp));
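
Note that MONTH(A.event_timestamp)+1 never equals 1, so a December registration would never be matched to a January return. If the data spans a year boundary, one way to avoid that is to compare full year-month values instead, for example:

SELECT DATE_FORMAT(A.event_timestamp, '%Y-%m') AS reg_month,
  COUNT(DISTINCT CASE WHEN A.event_name = 'registration' THEN A.user_id END) AS new_users,
  COUNT(DISTINCT B.user_id) AS returned_users
FROM main AS A
LEFT JOIN main AS B
  ON A.user_id = B.user_id
  AND DATE_FORMAT(A.event_timestamp + INTERVAL 1 MONTH, '%Y-%m') = DATE_FORMAT(B.event_timestamp, '%Y-%m')
  AND A.event_name = 'registration'
  AND B.event_name = 'action'
GROUP BY DATE_FORMAT(A.event_timestamp, '%Y-%m');

The ratio column can then be added on top exactly as in the query above.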