Skip to content
Advertisement

Handling of generate_series() in queries with date or timestamp with / without time zone

I have a query to generate a report based on a date series that is grouped by date and employee_id. The date should be based on a particular time zone, in this case ‘Asia/Kuala_Lumpur’. But this can change depending on where the user’s time zone is.

Definition and sample data for table attendance:

Here is a fiddle to test.

The query does not include rows 8-10 in the output for the time zone Asia/Kuala_Lumpur +8, though it should. Result shows “rows” field 11,12.

How can I fix the query so that it generates the report based on dates of a given time zone? (Meaning I can change Asia/Kuala_Lumpur to America/New_York etc.)

I was told to do something like this:

But I am not sure how to apply it. Does not seem to work properly in this fiddle. It should include rows 8,9,10,11,12 but only rows 8,9,10 show up.

Advertisement

Answer

DB design

Consider some modifications to your setup:

Defining a PK makes it easier to aggregate rows, because the PK covers the whole row in the GROUP BY clause. See:

I wouldn’t use “name” as column name. It’s not descriptive. Every other column could be named “name”. Consider:

Query

db<>fiddle here

Note that my query outputs local date and time for Asia/Kuala_Lumpur:

Where to start? You need to understand the concepts of time zones and the Postgres data types timestamp with time zone (timestamptz) vs. timestamp without time zone (timestamp). Else, it will be confusion without end. Start here:

Most notably, timestamptz does not store a time zone:

When simply casting timestamptz to date or timestamp, the current time zone setting of the session is assumed. Not what you want. Provide a time zone explicitly with the AT TIME ZONE construct to avoid this pifall. In your fiddle you have both:

Also not doing what you want. After the (faulty!) cast to timestamp, the AT TIME ZONE construct translates the values back to timestamptz.

Also, your query generates the complete Cartesian Product of all users and and the maximum range of days in the the table attendance, only to reduce it back to a single day with:

The WHERE clause finally does what it’s supposed to do. But it makes no sense to first generate the full range of days, only to throw away most of it. (Seems you copied that from my other fiddle in the meantime?)

I commented out the WHERE clause and kept an optimized version of your generate_series() in my query as proof of concept. Further reading:

User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement