I have a table with different species and their occurrences (timestamps) from a field survey:
species | timestamp
A | 2021-04-01T12:35
A | 2021-04-01T12:36
B | 2021-04-01T12:36
A | 2021-04-01T12:37
A | 2021-04-01T12:39
C | 2021-04-01T12:40
B | 2021-04-01T12:42
C | 2021-04-01T12:44
C | 2021-04-01T12:44
B | 2021-04-01T12:47
Now I want to count them, not in total but in chunks of 10 minutes each, where only the first occurrence of a species within a chunk is counted. That means between 12:31 and 12:40 there are species A, B and C. Between 12:41 and 12:50 there are species B and C.
For a single 10-minute chunk I can do something like:
SELECT DISTINCT species FROM tablename WHERE timestamp BETWEEN '2021-04-01T12:31' AND '2021-04-01T12:40'
But what I need is something like:
chunk_start_time | chunk_end_time | species
2021-04-01T12:31 | 2021-04-01T12:40 | A
2021-04-01T12:31 | 2021-04-01T12:40 | B
2021-04-01T12:31 | 2021-04-01T12:40 | C
2021-04-01T12:41 | 2021-04-01T12:50 | B
2021-04-01T12:41 | 2021-04-01T12:50 | C
My timestamps have no seconds, which is why each chunk runs from minute x1 to minute x0. How can I do the math in SQLite, or would it be better to use Python pandas for that?
Answer
Use the date and time function strftime() to produce the chunks:
SELECT DISTINCT
  strftime(
    '%Y-%m-%dT%H:%M',
    timestamp,
    '-' || CASE WHEN timestamp LIKE '%0' THEN 9 ELSE SUBSTR(timestamp, -1) - 1 END || ' minute'
  ) chunk_start_time,
  strftime(
    '%Y-%m-%dT%H:%M',
    timestamp,
    '+' || CASE WHEN timestamp LIKE '%0' THEN 0 ELSE 10 - SUBSTR(timestamp, -1) END || ' minute'
  ) chunk_end_time,
  species
FROM tablename
ORDER BY chunk_start_time, species;
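If you want to try the query from Python without pandas, the following is a minimal sketch using the standard-library sqlite3 module. It assumes an in-memory database, a table called tablename with two TEXT columns, and the sample rows from the question; the helper name chunked_species is made up for illustration.

```python
import sqlite3

# Sample rows copied from the question (species, timestamp without seconds).
ROWS = [
    ("A", "2021-04-01T12:35"),
    ("A", "2021-04-01T12:36"),
    ("B", "2021-04-01T12:36"),
    ("A", "2021-04-01T12:37"),
    ("A", "2021-04-01T12:39"),
    ("C", "2021-04-01T12:40"),
    ("B", "2021-04-01T12:42"),
    ("C", "2021-04-01T12:44"),
    ("C", "2021-04-01T12:44"),
    ("B", "2021-04-01T12:47"),
]

# The query from the answer: the last digit of the minute decides how far
# to shift back to x1 (chunk start) and forward to x0 (chunk end).
QUERY = """
SELECT DISTINCT
  strftime('%Y-%m-%dT%H:%M', timestamp,
           '-' || CASE WHEN timestamp LIKE '%0' THEN 9
                       ELSE SUBSTR(timestamp, -1) - 1 END || ' minute'
  ) AS chunk_start_time,
  strftime('%Y-%m-%dT%H:%M', timestamp,
           '+' || CASE WHEN timestamp LIKE '%0' THEN 0
                       ELSE 10 - SUBSTR(timestamp, -1) END || ' minute'
  ) AS chunk_end_time,
  species
FROM tablename
ORDER BY chunk_start_time, species;
"""


def chunked_species(rows):
    """Hypothetical helper: load rows into an in-memory table and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        # Assumed schema: both columns stored as TEXT.
        conn.execute("CREATE TABLE tablename (species TEXT, timestamp TEXT)")
        conn.executemany("INSERT INTO tablename VALUES (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


result = chunked_species(ROWS)
for chunk_start, chunk_end, species in result:
    print(f"{chunk_start} | {chunk_end} | {species}")
```

With the sample data this should print the five rows of the desired result table from the question. Note that the approach relies on the timestamp text ending in the minute digit, so it would need adjusting if seconds were stored.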