Developers Club geek daily blog

1 year, 6 months ago
The subject of work with time marks in PostgreSQL is badly opened in Russian-language profile publications on the Internet and is a frequent source of problems in work of programmers. I bring to your attention transfer of material from Hubert Lubaczewski, the author of the popular foreign blog depesz.com. I hope, article will be useful to you!

image

From time to time in IRC or in mailing groups somebody asks questions which show deep misunderstanding (or a lack of understanding) of time stamps, especially those which consider time zones. As I already faced it earlier, allow me to tell that such timestamps as to work with them and what most widespread hitches you can face.

We have two data types which we can use:

  • timestamp
  • timestamp with the time zone (or timestamptz)

The timestamp type contains only date and time, any other information. On the one hand, can seem that it is quite good (eventually, in everyday life we use only date and time, isn't it?), but with another is a most complete horror.

Let's provide that you have a time mark "2014-04-04 8:00:00 PM". What she speaks to you about? Unfortunately, not about much. Everything depends on about what point of the planet there is a speech. Eight evenings on April 4th is a different timepoint in Los Angeles, Chicago, London, Warsaw or Moscow. In it a problem of time zones.

Of course, you can think: "I will always be in one time zone, I do not need troubles with support of different temporary zones. In my time zone of date and time will be quite enough to note any timepoint, quite so we do in "real life".

But whether so it actually?

Let's provide that you have a tag ‘2013-10-27 2:00:00 AM', and you know that your application is attached to the Polish time. In this case, you were not lucky any more because it can be 2 o'clock in the morning on the Central European daylight saving time (CEST) or is one hour more, on normal Central European time. In total because of seasonal time conversion.

I consider that use of time marks without time zone almost always is a bug, and it needs to be corrected. Problems become even more if, entries in your application arrive from different time zones (for example, the application scheduler).

So the most obvious solution – to use time stamps with time zones (timestamptz).

First, it will not figure prominently on a disk:

$ select typname, typlen from pg_type where typname ~ '^timestamp';
   typname   | typlen 
-------------+--------
 timestamp   |      8
 timestamptz |      8
(2 rows)

How it works? The tag has to know the time zone, so why more place for this purpose is not required?

The matter is that she does not know the time zone. Inside, all values in the columns timestamptz are specified in the UTC format (coordinated universal time).

UTC have pleasant features: it has no shift (he is a starting point from which shifts of other time zones are considered), and it has no difference between daylight and winter saving time. So any time mark in the UTC format always with guarantee indicates only one point in time.

But if all the time to specify on UTC, then how I learn time in the time zone necessary to me?

Every time, so far as concerns timestamptz values if the time zone is not specified, then PostgreSQL uses in advance configured time. And you can configure it by different methods:

  • the timezone parameter in postgresql.conf
  • alter database … set timezone =‘ …'
  • alter user … set timezone =‘ …'
  • SET timezone = ‘…'

The first parameter is applied to specify in what time zone there is your server. In other words, the time zone by default which will be used in the absence of other changes.

The following two methods change value by default for the selected database and the user.

The last method can be used if you want that your connection with the database worked with other settings.

Take into account as at the same time now output changes ():

$ select now();
              now              
-------------------------------
 2014-04-04 20:32:59.390583+02
(1 row)
 
$ set timezone = 'America/New_York';
SET
 
$ select now();
              now              
-------------------------------
 2014-04-04 14:33:06.442768-04
(1 row)

So every time when you browse or you change values timestamptz, PostgreSQL converts them from UTC.

It means that values can be compared easily (all of them in one time zone, are not present shifts for daylight or winter saving time so comparison is always possible).

And now let's provide that you have a 2014-04-04 8:00:00 PM value. And you know that this time in Los Angeles, but you want to save it in the database which functions in other time zone. You can check that the current shift makes-7 hours and to use value thus:

$ select '2014-04-04 20:00:00-07'::timestamptz;
      timestamptz       
------------------------
 2014-04-05 05:00:00+02

What has happened? Why the 8th evening is not shown?

The reason is simple – I inserted timestamp in some time zone into request. Inside, the tag was converted in UTC, and then, again is converted (perhaps, even without UTC, I am not sure) to my normal time zone which is:

$ show timezone;
 TimeZone 
----------
 Poland
(1 row)

If I set the time zone of Los Angeles, then the result of request would be such:

$ set timezone = 'America/Los_Angeles';
SET
 
$ select '2014-04-04 20:00:00-07'::timestamptz;
      timestamptz       
------------------------
 2014-04-04 20:00:00-07
(1 row)

It is important to understand that the displayed value always takes setup of the time zone into account.

There is one more method to receive 20:00 in Los Angeles:

$ set timezone = 'Poland';
SET
 
$ select '2014-04-04 20:00:00'::timestamp at time zone 'America/Los_Angeles';
        timezone        
------------------------
 2014-04-05 05:00:00+02
(1 row)

It is very important to add ":: timestamp" after value, differently we will receive something strange:

$ set timezone = 'Poland';
SET
 
$ select '2014-04-04 20:00:00' at time zone 'America/Los_Angeles';
      timezone       
---------------------
 2014-04-04 11:00:00
(1 row)

What here occurred? From where 11:00 undertook?

Value in quotes (2014-04-04 8:00:00 PM) is perceived as timestamptz that means the 8th evenings in my time zone:

select '2014-04-04 20:00:00'::timestamptz;
      timestamptz       
------------------------
 2014-04-04 20:00:00+02
(1 row)

And only after transfer of value to my time zone PG reads out "at time zone …" which is used for time display in the selected time zone.

Thus, timestamp at time zone issues timestamptz value which shows the moment when local time in the selected time zone was such as it is specified in command.

And timestamptz at time zone issues timestamp value which shows what was time in the selected time zone in the specified timepoint.

It sounds a little confusedly therefore give I I will give examples:

select '2014-04-04 20:00:00'::timestamptz at time zone 'UTC';
      timezone       
---------------------
 2014-04-04 18:00:00
(1 row)
 
select '2014-04-04 20:00:00'::timestamp at time zone 'UTC';
        timezone        
------------------------
 2014-04-04 22:00:00+02
(1 row)

Interesting the fact that we can use it for time conversion from one time zone in another even if Pg is not in one of them.

Let's say we want to learn, what time is it now in Los Angeles when in Moscow — the 8th mornings. My local time following:

$ show timezone;
 TimeZone 
----------
 Poland
(1 row)

It is not enough advantage of it.

For a start we need to define a point in time (in the timestamptz format) which shows the 8th mornings in Moscow:

$ select '2014-04-04 08:00:00'::timestamp at time zone 'Europe/Moscow';
        timezone        
------------------------
 2014-04-04 06:00:00+02
(1 row)

It says to me that it corresponds the 6th mornings in my time zone. But we want to learn time in Los Angeles. I could write ‘2014-04-04 6:00:00 '''' in the time zone of 'LA', but it is possible to make in a different way:

$ select ('2014-04-04 08:00:00'::timestamp at time zone 'Europe/Moscow') at time zone 'America/Los_Angeles';
      timezone       
---------------------
 2014-04-03 21:00:00
(1 row)

As expression ‘timestamp at time zone.' is the same that timestamptz, we can use "at time zone" once again to transfer it back to the time stamp (without indication of the time zone) relating to some other place.

I hope, now everything is clear to you. I long enough tried to understand this question, and at last understood everything :)

All this has one interesting ghost effect: it is not so simple to add indexes to the functions working with timestamptz. For example, you cannot create an index which will be used for receipt of a day of the week:

$ create table test (i timestamptz);
CREATE TABLE
 
$ create index q on test (to_char(i, 'Day'));
ERROR:  functions in index expression must be marked IMMUTABLE

As shown in an example above, the reason is very simple – the same point in time can belong to different days of the week depending on the time zone. And as to_char () uses the current time zone, it can issue different values for the same basic data depending on settings of the time zone in system:

$ set timezone = 'Europe/Warsaw';
SET
 
$ insert into test (i) values ('2014-04-04 06:00:00');
INSERT 0 1
 
$ select i, to_char(i, 'Day') from test;
           i            |  to_char  
------------------------+-----------
 2014-04-04 06:00:00+02 | Friday   
(1 row)
 
$ set timezone = 'Europe/Moscow';
SET
 
$ select i, to_char(i, 'Day') from test;
           i            |  to_char  
------------------------+-----------
 2014-04-04 08:00:00+04 | Friday   
(1 row)
 
$ set timezone = 'America/Los_Angeles';
SET
 
$ select i, to_char(i, 'Day') from test;
           i            |  to_char  
------------------------+-----------
 2014-04-03 21:00:00-07 | Thursday 
(1 row)

The same point in time, but different days. It can be different months or even different years, depending on where it was.

The time mark (without time zone) "shows" strength here – as in it the time zone is not specified, it can be used quietly for information extraction.

But we know how to transfer timestamptz to timestamp. It is necessary just to specify by it the time zone. Therefore we can try to make so:

create index q on test (to_char(i at time zone 'Poland', 'Day'));

But, unfortunately, nothing leaves. The matter is that to_char too versatile. You can use to_char here so:

$ select to_char(now(), 'TMMonth');
 to_char 
---------
 April
(1 row)
 
$ set lc_time = 'pl_PL.UTF-8';
SET
 
$ select to_char(now(), 'TMMonth');
 to_char  
----------
 Kwiecień
(1 row)

This time we receive other results not because of the time zone, and because of a locale.

The correct solution of the problem of indexing will be to write own function which will cause to_char in absolutely permanent "environment", and then already to index it. Here so:

create function day_from_ts(timestamptz) returns text as $$
select to_char( $1 at time zone 'Poland', 'Day' );
$$ language sql immutable;
CREATE FUNCTION

And now we can use it for indexing:

create index q on test (day_from_ts( i ));
CREATE INDEX

It is safe because function forces the time zone to accept Poland value, and it causes to_char so that to ignore value of a locale (in other words, in the to_char format there is no TM prefix).

It is natural to use this index, all requests have to use this function also:

select * from test where day_from_ts(i) = 'Friday';

One more important thing during the work with time zones – receipt of time of Unix, or a so-called era. In general, it is simple:

$ select extract(epoch from now());
    date_part     
------------------
 1396638868.57491
(1 row)

Interesting the fact that it does not depend on the time zone:

$ begin;
BEGIN
 
$ show timezone;
 TimeZone 
----------
 Poland
(1 row)
 
$ select now(), extract(epoch from now());
              now              |    date_part     
-------------------------------+------------------
 2014-04-04 21:15:27.834775+02 | 1396638927.83477
(1 row)
 
$ set timezone = 'America/Los_Angeles';
SET
 
$ select now(), extract(epoch from now());
              now              |    date_part     
-------------------------------+------------------
 2014-04-04 12:15:27.834775-07 | 1396638927.83477
(1 row)
 
$ commit;
COMMIT

The reason known not for everything is covered that Unix time is always accepted in the time zone of UTC. It means that when you retrieve an era from a time mark of timestamp, PG assumes that it is in UTC. What the following potential problems follow from:

$ select now(), extract(epoch from now());
              now              |    date_part    
-------------------------------+-----------------
 2014-04-04 21:19:01.456205+02 | 1396639141.4562
(1 row)
 
$ select extract(epoch from '2014-04-04 21:19:01.456205'::timestamp);
    date_part    
-----------------
 1396646341.4562
(1 row)

In the first case of Pg receives "a point in time" which is internally converted into UTC (and when it is displayed – it is transformed to my time zone, +2).

In the second case the time mark is in my time zone, but it is supposed that it is UTC (without converting!), and an era undertakes from 2014-04-04 21:19:01 UTCUTCUTC ' ' value, but not ‘2014-04-04 21:19:01' ''' ''' ''''.

Subtly.

To put it briefly, try to avoid timestamp and use timestamptz.

The last what I would like to tell about is not the bug or a potential problem, and it is rather a functionality about which many do not know.

As you saw, PostgreSQL uses timestamp (and timestamptz) with an accuracy of microseconds. Many people insist on that accuracy was only about one second though personally it is not pleasant to me.

Both timestamp, and timestamptz (and other types of data relating at the right time) can have the additional accuracy ("precision").

Let me give a simple example:

$ select now(), now()::timestamptz(0), now()::timestamptz(1);
              now              |          now           |           now            
-------------------------------+------------------------+--------------------------
 2014-04-04 21:23:42.322315+02 | 2014-04-04 21:23:42+02 | 2014-04-04 21:23:42.3+02
(1 row)

Of course, you can use it and in tables:

$ create table test (i timestamptz(0));
CREATE TABLE
 
$ insert into test(i) values (now());
INSERT 0 1
 
$ select * from test;
           i            
------------------------
 2014-04-04 21:24:16+02
(1 row)

It's cool! You do not need to change "now ()" or something else, just add accuracy to data type, and it will correct everything.

I mentioned that it is not pleasant to me. The reason is simple – in any rather loaded system second – too low level of accuracy. Especially as data storage with an accuracy of microsecond does not cost anything to me, but can be useful. On the other hand, if data to microseconds, then how to me to make that values were displayed without fractions of a second?

Everything is simple: I use (in Select queries) to_char funution (), either date_trunc, or even reduction to the timestamptz (0) type:

$ select now(),
    to_char(now(), 'YYYY-MM-DD HH24:MI:SS TZ'),
    date_trunc('second', now()),
    now()::timestamptz(0);
              now              |         to_char          |       date_trunc       |          now           
-------------------------------+--------------------------+------------------------+------------------------
 2014-04-04 21:28:20.827763+02 | 2014-04-04 21:28:20 CEST | 2014-04-04 21:28:20+02 | 2014-04-04 21:28:21+02
(1 row)


In more detail about how to work with timestamps, we are going to tell on the PG Day conference' 16 Russia in July, 2016! Prepare the questions, we will try to answer them.

This article is a translation of the original post at habrahabr.ru/post/273177/
If you have any questions regarding the material covered in the article above, please, contact the original author of the post.
If you have any complaints about this article or you want this article to be deleted, please, drop an email here: sysmagazine.com@gmail.com.

We believe that the knowledge, which is available at the most popular Russian IT blog habrahabr.ru, should be accessed by everyone, even though it is poorly translated.
Shared knowledge makes the world better.
Best wishes.

comments powered by Disqus