Skip to content
Advertisement

Check whether value exists in column for each group

Having a difficult time putting what I am trying to do into words so searching is also difficult.

Basically I am trying to look whether a certain value exists in a column, partitioned by group, and then propagate that value forward.

In this example I want to check whether a user has completed the tutorial and set a flag that carries forward.

pk | user | ... | activity
 1 |    A | ... |  "login"
 2 |    A | ... |  "started_tutorial"
 3 |    A | ... |  "completed_tutorial"
 4 |    A | ... |  "some other activity"
 5 |    A | ... |  "logout"
 5 |    B | ... |  "login"
 6 |    B | ... |  "logout"

I think this should be something like

select *,
    check(activity in ('completed_tutorial')) as completed_activity
    from tbl

but I don’t think I can use check in a select statement and this would be a constant flag rather than set to true only after it has been found.

Example of what I am trying to get:

pk | user | ... | activity               | completed_tutorial
 1 |    A | ... |  "login"               |                 0
 2 |    A | ... |  "started_tutorial"    |                 0
 3 |    A | ... |  "completed_tutorial"  |                 1
 4 |    A | ... |  "some other activity" |                 1
 5 |    A | ... |  "logout"              |                 1
 5 |    B | ... |  "login"               |                 0
 6 |    B | ... |  "logout"              |                 0

Advertisement

Answer

You can filter SQL groups with the HAVING clause. For example, you can group your table by users and their activity, and then filter it to contain only those that have completed the tutorial:

SELECT user FROM tbl
GROUP BY user, activity
HAVING activity = 'completed_tutorial';

EDIT: After OP has edited their question, this is my new answer. Here, I assume that your table has a date field.

SELECT *, COALESCE(date >= (
    SELECT date FROM tbl WHERE activity = 'completed_tutorial'
    AND user = outertbl.user
), FALSE)
FROM tbl AS outertbl
ORDER BY date

Notice, that such query is essentially N² when unoptimised, so I would recommend instead to just get the data from the database and then process it in your program.

User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement