# Calculate interval between boolean column change

I have a table with measurements of weather here is a simplified version of it:

```"station_id","measured_at","rainy"
-------------------------------------------------------------------------
"b6b53561-dab5-4b9a-8d28-a7de1e4d1485","2020-01-31 18:35:35.27+00",FALSE
"b6b53561-dab5-4b9a-8d28-a7de1e4d1485","2020-01-31 18:36:33.976+00",FALSE
"b6b53561-dab5-4b9a-8d28-a7de1e4d1485","2020-01-31 18:37:33.864+00",FALSE
"b6b53561-dab5-4b9a-8d28-a7de1e4d1485","2020-01-31 18:38:34.767+00",TRUE
"b6b53561-dab5-4b9a-8d28-a7de1e4d1485","2020-01-31 18:39:36.076+00",TRUE
"b6b53561-dab5-4b9a-8d28-a7de1e4d1485","2020-01-31 18:40:29.776+00",FALSE
"b6b53561-dab5-4b9a-8d28-a7de1e4d1485","2020-01-31 18:41:35.579+00",FALSE
"b6b53561-dab5-4b9a-8d28-a7de1e4d1485","2020-01-31 18:42:34.274+00",TRUE
"b6b53561-dab5-4b9a-8d28-a7de1e4d1485","2020-01-31 18:43:23.842+00",TRUE
"b6b53561-dab5-4b9a-8d28-a7de1e4d1485","2020-01-31 18:44:35.08+00",FALSE
```

and I need to calculate how much time is been sunny. I know I need to use window functions but I’m stuck. Because I need this interval to be calculated in some range for example for last day. I have been able to make a query to this stage.

```SELECT
prev.station_id,
prev.rainy,
prev.measured_at AS started_at,
COALESCE(LEAD(prev.measured_at) OVER (ORDER BY prev.measured_at ASC), NOW()) AS ended_at,
(COALESCE(LEAD(prev.measured_at) OVER (ORDER BY prev.measured_at ASC), NOW()) - prev.measured_at) AS diff
FROM (
SELECT
m.station_id,
m.measured_at,
m.rainy,
COALESCE(LEAD(m.rainy) OVER (ORDER BY m.measured_at ASC), m.rainy) AS prev_rainy
FROM
z_measurements m
WHERE m.measured_at >= '2020-01-30T00:00:00.000Z'
ORDER BY m.measured_at ASC
) prev
WHERE prev.rainy IS DISTINCT FROM prev.prev_rainy
ORDER BY prev.measured_at ASC;
```

this query results in to:

```"station_id","rainy","started_at","ended_at","diff"
---------------------------------------------------
"b6b53561-dab5-4b9a-8d28-a7de1e4d1485",FALSE,"2020-01-31 18:37:33.864","2020-01-31 18:39:36.076+00","00:02:02.212"
"b6b53561-dab5-4b9a-8d28-a7de1e4d1485",TRUE,"2020-01-31 18:39:36.076","2020-01-31 18:41:35.579+00","00:01:59.503"
"b6b53561-dab5-4b9a-8d28-a7de1e4d1485",FALSE,"2020-01-31 18:41:35.579","2020-01-31 18:43:23.842+00","00:01:48.263"
"b6b53561-dab5-4b9a-8d28-a7de1e4d1485",TRUE,"2020-01-31 18:43:23.842","2020-01-31 21:18:04.89333+00","02:34:41.05133"
```

but if the first lane of this result is `rainy = False` the `start_date` should be the same as one in query `2020-01-30T00:00:00.000Z` (I’m assuming that if it’s sunny on the first result it has been sunny for whole time between my range start and this first recorded measurement) and also there is missing the last row because last measured state is sunny. So `started_at` for this last row should be `2020-01-31 18:44:35.08+00` and `end_at` should be `NOW()`.

I’m using postgresql 12.1.

```SELECT t.station_id, t.rainy, t.started_at, t.ended_at, t.ended_at - t.started_at AS diff
FROM (
SELECT
prev.station_id,
prev.rainy,
CASE
WHEN LAG(prev.measured_at) OVER measured_at_by_station_id IS NULL THEN '2020-01-30T00:00:00.000Z'
ELSE prev.measured_at
END AS started_at,
LEAD(prev.measured_at, 1, NOW()) OVER measured_at_by_station_id AS ended_at
FROM (
SELECT
m.station_id,
m.measured_at,
m.rainy,
LAG(m.rainy, 1, NOT(m.rainy)) OVER (PARTITION BY m.station_id ORDER BY m.measured_at ASC) AS prev_rainy
FROM z_measurements m
WHERE m.measured_at >= '2020-01-30T00:00:00.000Z'
ORDER BY m.station_id ASC, m.measured_at ASC
) prev
WHERE prev.rainy IS DISTINCT FROM prev.prev_rainy
WINDOW measured_at_by_station_id AS (PARTITION BY prev.station_id ORDER BY prev.measured_at ASC)
) t
ORDER BY t.station_id ASC, t.started_at ASC
```
5 People found this is helpful