Skip to content
Advertisement

With PostgreSQL how to determine the average completion time given two time stamps while accounting for nulls

In my PostgreSQL database I have an invitations table with the following fields:

Invitations: id, created_at, completed_at (timestamp)

I am working to write a PostgreSQL query that tells me the average completion time for invitations per weekly cohort:

Here is what I have so far:

SELECT  
    TRUNC(DATE_PART('day', CURRENT_DATE - i.created_at )/7)  AS weeks_ago,
    date(min(i.created_at)) AS "Date Start",
    date(max(i.created_at)) AS "Date End",
    avg(i.completed_at - i.created_at) as average_days_to_complete                
FROM invitations i
GROUP BY weeks_ago
ORDER BY weeks_ago ASC;

The problem with my query is this line:

`avg(i.completed_at - i.created_at) as average_days_to_complete`

The problem is this select statement is not ignoring invitations that were not completed, meaning i.completed_at is NULL

How can I update the above to only include invitations that were completed?

Advertisement

Answer

How do you want to treat them? You can give them the current timestamp:

SELECT TRUNC(DATE_PART('day', CURRENT_DATE - i.created_at )/7)  AS weeks_ago,
       date(min(i.created_at)) AS "Date Start",
       date(max(i.created_at)) AS "Date End",
      avg(coalesce(i.completed_at, now()) - i.created_at) as average_days_to_complete                
FROM invitations i
GROUP BY weeks_ago
ORDER BY weeks_ago ASC;

If you want to ignore them, then your version does that:

avg(i.completed_at - i.created_at)

When i.completed_at is NULL, then the result is NULL and avg() ignores NULL values.

User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement