With PostgreSQL how to determine the average completion time given two time stamps while accounting for nulls

In my PostgreSQL database I have an invitations table with the following fields:

Invitations: id, created_at, completed_at (timestamp)

​x
 
Invitations: id, created_at, completed_at (timestamp)​

I am working to write a PostgreSQL query that tells me the average completion time for invitations per weekly cohort:

Here is what I have so far:

SELECT  
    TRUNC(DATE_PART('day', CURRENT_DATE - i.created_at )/7)  AS weeks_ago,
    date(min(i.created_at)) AS "Date Start",
    date(max(i.created_at)) AS "Date End",
    avg(i.completed_at - i.created_at) as average_days_to_complete                
FROM invitations i
GROUP BY weeks_ago
ORDER BY weeks_ago ASC;

 
SELECT      TRUNC(DATE_PART('day', CURRENT_DATE - i.created_at )/7)  AS weeks_ago,    date(min(i.created_at)) AS "Date Start",    date(max(i.created_at)) AS "Date End",    avg(i.completed_at - i.created_at) as average_days_to_complete                FROM invitations iGROUP BY weeks_agoORDER BY weeks_ago ASC;​

The problem with my query is this line:

`avg(i.completed_at - i.created_at) as average_days_to_complete`

`avg(i.completed_at - i.created_at) as average_days_to_complete`

The problem is this select statement is not ignoring invitations that were not completed, meaning i.completed_at is NULL

How can I update the above to only include invitations that were completed?

Answer

How do you want to treat them? You can give them the current timestamp:

SELECT TRUNC(DATE_PART('day', CURRENT_DATE - i.created_at )/7)  AS weeks_ago,
       date(min(i.created_at)) AS "Date Start",
       date(max(i.created_at)) AS "Date End",
      avg(coalesce(i.completed_at, now()) - i.created_at) as average_days_to_complete                
FROM invitations i
GROUP BY weeks_ago
ORDER BY weeks_ago ASC;

 
SELECT TRUNC(DATE_PART('day', CURRENT_DATE - i.created_at )/7)  AS weeks_ago,       date(min(i.created_at)) AS "Date Start",       date(max(i.created_at)) AS "Date End",      avg(coalesce(i.completed_at, now()) - i.created_at) as average_days_to_complete                FROM invitations iGROUP BY weeks_agoORDER BY weeks_ago ASC;​

If you want to ignore them, then your version does that:

avg(i.completed_at - i.created_at)

 
avg(i.completed_at - i.created_at)​

When i.completed_at is NULL, then the result is NULL and avg() ignores NULL values.

Advertisement

Answer