matthijs/sqlpp11-connector-postgresql

Timezones not handled properly

dgel opened this issue · 1 comments

dgel commented

(I've updated to the latest master for these tests)

When inserting a timestamp directly, timezones are not kept into account regardless of the column type in the db. What's more, the date is explicitly inserted as a timestamp, thus inserting what is a UTC time stamp directly into a table that might be using the 'Europe/Berlin' timezone. E.g. if both the machine and the db use that timezone during daylight savings time, at 14:00, it will insert 12:00, which will be stored as 12:00+02, or 10:00 UTC.

During Select, such a column gets manipulated, but wrongly, calculating one timestamp that is wrong, and returning another that is also wrong.

Prepared statements also do this calculating when retrieving a datetime, but on inserting do additional work in an attempt to insert the correct timestamp, however it seems to apply the local time timezone offset without changing the actual time. (i.e. 12:00UTC is inserted as 12:00+01 instead of 13:00+01)

See below a program that shows some insertions that go wrong. Set the timezone of the test database to a different one from the local machine to fully see the issue.

#include "tables/public/dates.h"

#include <chrono>
#include <iostream>
#include <sqlpp11/postgresql/postgresql.h>
#include <sqlpp11/sqlpp11.h>

#include <string>

#include <date/date.h>
#include <date/tz.h>

int main()
{
	auto conf = std::make_shared<sqlpp::postgresql::connection_config>();
	conf->host = "127.0.0.1";
	conf->user = "test";
	conf->password = "test";
	conf->dbname = "test";
  conf->debug = true;

	auto connection = sqlpp::postgresql::connection(conf);

	connection.execute(R"(DROP TABLE IF EXISTS dates;)");
	connection.execute(R"(CREATE TABLE dates
               (
                 id VARCHAR(40) NOT NULL,
			     time TIMESTAMP WITH TIME ZONE
               ))");

  using namespace date;
  using namespace std::chrono;

  // 2pm on August 1st 2019
  std::chrono::system_clock::time_point insert_time1 = sys_days{2019_y/August/1} + 14h;
  std::chrono::system_clock::time_point insert_time2 = sys_days{2019_y/November/1} + 14h;

	model::dates dates;
	connection(
		sqlpp::insert_into(dates)
		.set(
			dates.id = "direct_daylight_saving",
			dates.time = date::floor<std::chrono::microseconds>(insert_time1)));
	connection(
		sqlpp::insert_into(dates)
		.set(
			dates.id = "directl_no_daylight_saving",
			dates.time = date::floor<std::chrono::microseconds>(insert_time2)));

	auto prepared = connection.prepare(sqlpp::insert_into(dates).set(dates.id = sqlpp::parameter(dates.id),
		dates.time = sqlpp::parameter(dates.time)));
	prepared.params.id = "prepared_daylight_saving";
	prepared.params.time = date::floor<std::chrono::microseconds>(insert_time1);
	connection(prepared);
	prepared.params.id = "prepared_no_daylight_saving";
	prepared.params.time = date::floor<std::chrono::microseconds>(insert_time2);
	connection(prepared);

	using namespace date;
	std::cout << "original_daylight_saving: " << insert_time1 << std::endl;
	std::cout << "original_no_daylight_saving: " << insert_time2 << std::endl;
	for (const auto& row : connection(sqlpp::select(dates.id, dates.time).from(dates).unconditionally()))
	{
		std::cout << row.id.value() << ": " << row.time.value() << std::endl;
	}
}

Proposal:

If serialization of normal insertions can be specialized for postgresql, I recommend always inserting utc directly, without specifying the column type. Postgresql will ignore the timezone specification for normal timestamp columns, and parse it correctly for timestamp with time zone columns. E.g. insert VALUES ('2019-08-01 12:00:00+00') instead of VALUES (TIMESTAMP '2019-08-01 12:00:00')

When selecting data, postgresql will automatically add the right timezone offset for the database, for example returning '2019-08-01 14:00:00+02'. All that needs to be done when binding results is subtract the timezone offset if it is present.

It might be nice to require the timezone module for hinnant's date library as well, and map 'timestamp' to local_time and 'timestamp with time zone' to sys_time

dgel commented

Thinking about it a bit more, the current issues could also be fixed by using the timezone module, and for any date we insert obtain a sys_info containing the timezone offset for that date, then insert the modified time point along with its offset. On timestamptz columns, postgresql will calculate the resulting time point in the database's time zone, and on timestamp columns, the timezone will be inserted as is. So the first will contain values in the database's time zone, and the second in the OS's timezone. Then when binding results, if the result has a timezone you simply subtract it, and otherwise you subtract the offset for that date listed in date::sys_info.

This would properly handle both column types, whereas the proposal above would always insert UTC for timestamp columns.

I have a branch implementing the first proposal, as I'm currently only using timestamptz columns, if you're interested I could open a PR