Get distinct values within time chunks

I have a table of species and their occurrences (timestamps) from a field survey:

species | timestamp
A       | 2021-04-01T12:35
A       | 2021-04-01T12:36
B       | 2021-04-01T12:36
A       | 2021-04-01T12:37
A       | 2021-04-01T12:39
C       | 2021-04-01T12:40
B       | 2021-04-01T12:42
C       | 2021-04-01T12:44
C       | 2021-04-01T12:44
B       | 2021-04-01T12:47

Now I want to count them, not in total but in chunks of 10 minutes each, where only the first occurrence of a species within a chunk is counted. That means between 12:31 and 12:40 there are species A, B and C, and between 12:41 and 12:50 there are species B and C.

For a single 10 minute chunk I can do something like

SELECT DISTINCT species
FROM tablename
WHERE timestamp BETWEEN '2021-04-01T12:31' AND '2021-04-01T12:40'

But what I need is something like:

chunk_start_time | chunk_end_time   | species 
2021-04-01T12:31 | 2021-04-01T12:40 | A       
2021-04-01T12:31 | 2021-04-01T12:40 | B       
2021-04-01T12:31 | 2021-04-01T12:40 | C       
2021-04-01T12:41 | 2021-04-01T12:50 | B       
2021-04-01T12:41 | 2021-04-01T12:50 | C       

My timestamps have no seconds, which is why each chunk runs from minute x1 to minute x0. How can I do the math in SQLite, or would it be better to use Python pandas for that?

Answer

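Use the date and time function strftime() to produce the chunks. The last digit of the minute tells you how far the timestamp is from the start (x1) and the end (x0) of its chunk, and that offset is passed to strftime() as a minute modifier: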

SELECT DISTINCT
       -- step back to minute x1 (9 minutes if the timestamp ends in 0)
       strftime(
         '%Y-%m-%dT%H:%M',
         timestamp,
         '-' || CASE WHEN timestamp LIKE '%0' THEN 9 ELSE SUBSTR(timestamp, -1) - 1 END || ' minute'
       ) chunk_start_time,
       -- step forward to minute x0 (0 minutes if the timestamp ends in 0)
       strftime(
         '%Y-%m-%dT%H:%M',
         timestamp,
         '+' || CASE WHEN timestamp LIKE '%0' THEN 0 ELSE 10 - SUBSTR(timestamp, -1) END || ' minute'
       ) chunk_end_time,
       species
FROM tablename
ORDER BY chunk_start_time, species;

See the demo.
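
If you want to try the query from Python without setting up a database file, here is a minimal sketch using the standard-library sqlite3 module and an in-memory database. The table name tablename and the column names are taken from the query above. The helper name chunked_species and the ROWS list are just illustrative names, not part of the original answer, and it assumes every timestamp is stored as text in the form YYYY-MM-DDTHH:MM.

```python
import sqlite3

# Sample rows copied from the question (species, timestamp without seconds).
ROWS = [
    ("A", "2021-04-01T12:35"),
    ("A", "2021-04-01T12:36"),
    ("B", "2021-04-01T12:36"),
    ("A", "2021-04-01T12:37"),
    ("A", "2021-04-01T12:39"),
    ("C", "2021-04-01T12:40"),
    ("B", "2021-04-01T12:42"),
    ("C", "2021-04-01T12:44"),
    ("C", "2021-04-01T12:44"),
    ("B", "2021-04-01T12:47"),
]

# Same query as in the answer, with single % signs in the format string.
QUERY = """
SELECT DISTINCT
       strftime('%Y-%m-%dT%H:%M', timestamp,
         '-' || CASE WHEN timestamp LIKE '%0' THEN 9
                     ELSE SUBSTR(timestamp, -1) - 1 END || ' minute'
       ) AS chunk_start_time,
       strftime('%Y-%m-%dT%H:%M', timestamp,
         '+' || CASE WHEN timestamp LIKE '%0' THEN 0
                     ELSE 10 - SUBSTR(timestamp, -1) END || ' minute'
       ) AS chunk_end_time,
       species
FROM tablename
ORDER BY chunk_start_time, species;
"""


def chunked_species(rows):
    """Load rows into an in-memory SQLite table and return the distinct
    (chunk_start_time, chunk_end_time, species) tuples."""
    con = sqlite3.connect(":memory:")
    try:
        con.execute("CREATE TABLE tablename (species TEXT, timestamp TEXT)")
        con.executemany("INSERT INTO tablename VALUES (?, ?)", rows)
        return con.execute(QUERY).fetchall()
    finally:
        con.close()


if __name__ == "__main__":
    for row in chunked_species(ROWS):
        print(" | ".join(row))
```

With the sample data this should give the five rows asked for in the question. Because the offsets are applied as strftime() modifiers rather than by editing the string, a timestamp such as 12:55 rolls over correctly to a chunk ending at 13:00.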

User contributions licensed under: CC BY-SA