I need help removing duplicate time zone records from my select query without deleting them. The current result is as follows:
Employee ID # Presence Presence Start Time Presence End Time GMT Presence Start Time 691 Out Of Office 2020-02-01 04:30:00 2020-02-01 14:30:00 2020-02-01 12:30:00 691 Out Of Office 2020-02-01 05:30:00 2020-02-01 15:30:00 2020-02-01 12:30:00 691 Out Of Office 2020-02-01 07:30:00 2020-02-01 17:30:00 2020-02-01 12:30:00 691 Out Of Office 2020-02-01 13:30:00 2020-02-01 23:30:00 2020-02-01 12:30:00 691 Out Of Office 2020-02-01 20:30:00 2020-02-02 06:30:00 2020-02-01 12:30:00 435 Out Of Office 2020-02-01 00:15:00 2020-02-01 09:00:00 2020-01-31 16:15:00 5681 Out Of Office 2020-02-02 07:00:00 2020-02-02 15:45:00 2020-02-02 15:00:00 5681 Out Of Office 2020-02-02 08:00:00 2020-02-02 16:45:00 2020-02-02 15:00:00 5681 Out Of Office 2020-02-02 10:00:00 2020-02-02 18:45:00 2020-02-02 15:00:00 5681 Out Of Office 2020-02-02 16:00:00 2020-02-03 00:45:00 2020-02-02 15:00:00 5681 Out Of Office 2020-02-02 23:00:00 2020-02-03 07:45:00 2020-02-02 15:00:00 1927 Out Of Office 2020-02-02 07:00:00 2020-02-02 18:15:00 2020-02-02 15:00:00 1927 Out Of Office 2020-02-02 08:00:00 2020-02-02 19:15:00 2020-02-02 15:00:00 1927 Out Of Office 2020-02-02 10:00:00 2020-02-02 21:15:00 2020-02-02 15:00:00 1927 Out Of Office 2020-02-02 16:00:00 2020-02-03 03:15:00 2020-02-02 15:00:00 1927 Out Of Office 2020-02-02 23:00:00 2020-02-03 10:15:00 2020-02-02 15:00:00
The table returns duplicate GMT start times for the same employee, the database appears to be duplicating the results based on different time zones.
I just want to remove the duplicate GMT Presence Start Times
Employee ID # 691 should have 1 row, same with 5681 and 1927. Can someone please help?
Advertisement
Answer
You can use a GROUP BY
clause to give unique values of GMT Presence Start Time
for each employees Presence
in the log. You need to use an aggregation function on the Presence Start Time
and Presence End Time
functions; I’ve chosen MIN
in my example, but you might want something else.
SELECT [Employee ID #], [Presence], MIN([Presence Start Time]), MIN([Presence Start Time]), [GMT Presence Start Time] FROM data GROUP BY [Employee ID #], [Presence], [GMT Presence Start Time]
Output (for your sample data):
Employee ID # Presence Presence Start Time Presence Start Time GMT Presence Start Time 435 Out Of Office 2020-02-01 00:15:00 2020-02-01 09:00:00 2020-01-31 16:15:00 691 Out Of Office 2020-02-01 04:30:00 2020-02-01 14:30:00 2020-02-01 12:30:00 1927 Out Of Office 2020-02-02 07:00:00 2020-02-02 18:15:00 2020-02-02 15:00:00 5681 Out Of Office 2020-02-02 07:00:00 2020-02-02 15:45:00 2020-02-02 15:00:00
Alternatively you can use window functions on the Presence
Start and End times:
SELECT DISTINCT [Employee ID #], [Presence], FIRST_VALUE([Presence Start Time]) OVER (PARTITION BY [Employee ID #], [Presence], [GMT Presence Start Time] ORDER BY [Presence Start Time]) AS [Presence Start Time], FIRST_VALUE([Presence End Time]) OVER (PARTITION BY [Employee ID #], [Presence], [GMT Presence Start Time] ORDER BY [Presence End Time]) AS [Presence End Time], [GMT Presence Start Time] FROM data
Output:
Employee ID # Presence Presence Start Time Presence End Time GMT Presence Start Time 435 Out Of Office 2020-02-01 00:15:00 2020-02-01 09:00:00 2020-01-31 16:15:00 691 Out Of Office 2020-02-01 04:30:00 2020-02-01 14:30:00 2020-02-01 12:30:00 1927 Out Of Office 2020-02-02 07:00:00 2020-02-02 18:15:00 2020-02-02 15:00:00 5681 Out Of Office 2020-02-02 07:00:00 2020-02-02 15:45:00 2020-02-02 15:00:00