Skip to content
Advertisement

How to count SQL data but also have a field that may be present or not?

I’m trying to figure out the right SQL query to group data but also include a boolean (YES/NO) if at least one of the rows grouped has a certain value.

Here’s a concrete example to help visualize:

Let’s say I have some data like this, maybe in a table called “experiments” with fields like this:

id, experiment_name, user_id, created_at, completed_at

And let’s say I have this data:

1, "test experiment", 23, 2019-09-10, NULL

2, "test experiment", 17, 2019-09-15, NULL

3, "test experiment", 23, 2019-09-18, 2019-09-19

4, "test experiment", 23, 2019-09-19, NULL

I want to write a SQL query that does a group_by the user_id so I can have a count, but also includes some boolean value of whether at least one of their rows included a completed_at value.

Thus, I can’t just do:

SELECT user_id, count(*), completed_at from db group by user_id, completed_at;

That ends up like:

user_id  count(*)   completed_at

23            2       NULL

23            1       2019-09-19

17            1       NULL

What I’d like is some query that lets me end up with something like:

user_id     count(*)    did_complete?

23              3          TRUE

17              1          FALSE

But I’m stumped on that part. I tried searching SO here but I can’t figure out how to even word this as a short, searchable question.

Advertisement

Answer

You use an aggregate function with a conditional expression to check if at least one record in the group has a non-null completed_at value:

SELECT 
    user_id, 
    COUNT(*) cnt, 
    MAX(CASE WHEN completed_at IS NOT NULL THEN 1 ELSE 0 END) did_complete
FROM experiments
GROUP BY user_id

I usually prefer using 0/1 rather than TRUE/FALSE since numbers are much more portable across different RDBMS than booleans.

User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement