How to get the first and last value of one column based on another column's values

I have data that looks like the following.

I want to extract the first and last “TS” values for each run of consecutive “Col” values (A, B, and C), that is, each time the “Col” value changes. The expected output should be as follows:

Thanks for your help in advance!

Answer

This is a type of gaps-and-islands problem. This version is probably best addressed using the difference of row numbers:
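As a minimal sketch of the row-number-difference idea (not necessarily the exact query intended here), the snippet below runs the SQL through Python's built-in `sqlite3` module so it can be tried without a database server. The table name `readings`, the helper `first_last_per_run`, and the sample rows are assumptions made up for illustration; only the `TS` and `Col` column names come from the question. It also assumes an SQLite build with window-function support (3.25 or later).

```python
import sqlite3

# Hypothetical table name "readings"; only TS and Col come from the question.
# seqnum numbers all rows by ts; seqnum_col numbers rows within each col value.
# Inside one unbroken run of the same col, both counters advance together,
# so their difference is constant and identifies the run (the "island").
ISLANDS_SQL = """
SELECT col,
       MIN(ts) AS first_ts,
       MAX(ts) AS last_ts
FROM (
    SELECT ts,
           col,
           ROW_NUMBER() OVER (ORDER BY ts) AS seqnum,
           ROW_NUMBER() OVER (PARTITION BY col ORDER BY ts) AS seqnum_col
    FROM readings
) AS numbered
GROUP BY col, seqnum - seqnum_col
ORDER BY first_ts
"""


def first_last_per_run(rows):
    """Return (col, first_ts, last_ts) for each run of consecutive col values."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE readings (ts INTEGER, col TEXT)")
        conn.executemany("INSERT INTO readings (ts, col) VALUES (?, ?)", rows)
        return conn.execute(ISLANDS_SQL).fetchall()
    finally:
        conn.close()


# Made-up sample data: A, A, B, B, B, A, C, C ordered by ts.
sample_rows = [
    (1, "A"), (2, "A"),
    (3, "B"), (4, "B"), (5, "B"),
    (6, "A"),
    (7, "C"), (8, "C"),
]
runs = first_last_per_run(sample_rows)
print(runs)
```

With this sample data the second appearance of A (at `ts = 6`) comes out as its own row, separate from the first A run, which is the behaviour the question asks for.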

This includes the col value on each row, which seems very useful.

EDIT:

If your data contains duplicate timestamps, you can use dense_rank() rather than row_number().
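A sketch of that variant, again run through Python's `sqlite3` module, might look like the following. Rows that tie on `ts` receive the same rank, so the grouping no longer depends on the arbitrary order that `row_number()` would give to tied rows. The table name `readings`, the helper `first_last_per_run_dense`, and the sample rows are hypothetical, and the example assumes that rows sharing a timestamp also share the same `Col` value.

```python
import sqlite3

# Hypothetical table name "readings"; only TS and Col come from the question.
# DENSE_RANK gives tied timestamps the same rank and leaves no gaps, so the
# difference of the two ranks stays constant within a run even with duplicates.
DENSE_ISLANDS_SQL = """
SELECT col,
       MIN(ts) AS first_ts,
       MAX(ts) AS last_ts
FROM (
    SELECT ts,
           col,
           DENSE_RANK() OVER (ORDER BY ts) AS rnk,
           DENSE_RANK() OVER (PARTITION BY col ORDER BY ts) AS rnk_col
    FROM readings
) AS ranked
GROUP BY col, rnk - rnk_col
ORDER BY first_ts
"""


def first_last_per_run_dense(rows):
    """Return (col, first_ts, last_ts) per run, tolerating duplicate timestamps."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE readings (ts INTEGER, col TEXT)")
        conn.executemany("INSERT INTO readings (ts, col) VALUES (?, ?)", rows)
        return conn.execute(DENSE_ISLANDS_SQL).fetchall()
    finally:
        conn.close()


# Made-up sample data with duplicated timestamps 1 and 3.
duplicate_rows = [
    (1, "A"), (1, "A"), (2, "A"),
    (3, "B"), (3, "B"),
    (4, "A"),
]
dense_runs = first_last_per_run_dense(duplicate_rows)
print(dense_runs)
```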

User contributions licensed under: CC BY-SA