
Bug: `FromUnixtimeOperatorTransformer` possibly built on false suppositions.

findinpath opened this issue · 2 comments

The FromUnixtimeOperatorTransformer seems to be built on false premises.

The Trino from_unixtime function returns a timestamp(3) with time zone.

However, the FromUnixtimeOperatorTransformer transforms from_unixtime expressions to
FORMAT_DATETIME(FROM_UNIXTIME(10000), 'yyyy-MM-dd HH:mm:ss'), which is a varchar.
This causes statements like the following to fail:

SELECT
  CAST(
    CAST(
      "at_timezone" (
        "format_datetime" (
          "from_unixtime" (
            CAST(
              "test_from_utc_timestamp_source"."source_float" AS DOUBLE
            )
          ),
          'yyyy-MM-dd HH:mm:ss'
        ),
        "$canonicalize_hive_timezone_id" ('America/Los_Angeles')
      ) AS TIMESTAMP (3)
    ) AS VARCHAR (65535)
  ) AS "ts_float"
FROM test_from_utc_timestamp_source

with the exception:

Caused by: io.trino.spi.TrinoException: Unexpected parameters (varchar, varchar) for function at_timezone. Expected: at_timezone(timestamp(p) with time zone, interval day to second), at_timezone(timestamp(p) with time zone, varchar(x))

Related PR: #426

Steps to reproduce the issue


CREATE TABLE test_from_utc_timestamp_source (source_float float);
CREATE VIEW test_from_utc_timestamp_view AS SELECT CAST(from_utc_timestamp(source_float, 'America/Los_Angeles') AS STRING) ts_float FROM test_from_utc_timestamp_source;

select * from test_from_utc_timestamp_view;

-- 1970-01-30 21:30:00


insert into hive.default.test_from_utc_timestamp_source values (2592000.0);

Trino version 420 with Coral version 2.2.9

SELECT * FROM hive.default.test_from_utc_timestamp_view;
Query 20231013_184405_00000_fa5un failed: line 1:15: Failed analyzing stored view 'hive.default.test_from_utc_timestamp_view': line 1:18: Unexpected parameters (varchar, varchar) for function at_timezone. Expected: at_timezone(timestamp(p) with time zone, varchar(x)), at_timezone(timestamp(p) with time zone, interval day to second)

trinoSql in ViewReaderUtil

SELECT CAST(CAST("at_timezone"("format_datetime"("from_unixtime"(CAST("test_from_utc_timestamp_source"."source_float" AS DOUBLE)), 'yyyy-MM-dd HH:mm:ss'), "$canonicalize_hive_timezone_id"('America/Los_Angeles')) AS TIMESTAMP(3)) AS VARCHAR(65535)) AS "ts_float"
FROM "default"."test_from_utc_timestamp_source" AS "test_from_utc_timestamp_source"

Trino version 420 with Coral version 2.1.5

trino> SELECT * FROM hive.default.test_from_utc_timestamp_view;
 1970-01-30 16:00:00.000 

trinoSql in ViewReaderUtil

SELECT CAST(CAST("at_timezone"("from_unixtime"(CAST("test_from_utc_timestamp_source"."source_float" AS DOUBLE)), "$canonicalize_hive_timezone_id"('America/Los_Angeles')) AS TIMESTAMP(3)) AS VARCHAR(65535)) AS "ts_float"
FROM "default"."test_from_utc_timestamp_source" AS "test_from_utc_timestamp_source"
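As a sanity check on the 1970-01-30 16:00:00.000 result from the Coral 2.1.5 run, the arithmetic can be reproduced outside Trino. This is only a sketch: it assumes America/Los_Angeles was on PST (UTC-8) in January 1970 and uses a fixed offset instead of the named zone, and the variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Value inserted into test_from_utc_timestamp_source in the reproduction steps.
epoch_seconds = 2592000.0

# Assumption: America/Los_Angeles was on PST (UTC-8) in January 1970, so a
# fixed offset stands in for the named zone and no tz database is needed.
pst = timezone(timedelta(hours=-8), "PST")

# Interpret the float as seconds since the epoch, in UTC.
utc_instant = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)

# Shift the same instant into the Los Angeles offset.
la_local = utc_instant.astimezone(pst)

print(utc_instant.strftime("%Y-%m-%d %H:%M:%S"))  # 1970-01-31 00:00:00
print(la_local.strftime("%Y-%m-%d %H:%M:%S"))     # 1970-01-30 16:00:00
```

The second printed value matches what Trino returns with Coral 2.1.5, which suggests the translation without format_datetime is the one that behaves as intended for this input.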





format_datetime shouldn't be there.

FromUtcTimestampOperatorTransformer makes use of both:

  • from_unixtime_nanos
  • from_unixtime

from_unixtime_nanos is a Trino-specific function.

Translation of from_unixtime_nanos works as expected.

from_unixtime is:

from_unixtime(bigint unixtime[, string pattern])

Converts a number of seconds since epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current time zone(using config "") using the specified pattern. If the pattern is missing the default is used ('uuuu-MM-dd HH:mm:ss' or yyyy-MM-dd HH:mm:ss'). Example: from_unixtime(0)=1970-01-01 00:00:00 (

Because from_unixtime is a Hive function, it gets its own FromUnixtimeOperatorTransformer, and that is where the problem actually occurs.
When we transform from_utc_timestamp via FromUtcTimestampOperatorTransformer, we create (when dealing with floats) a nested from_unixtime call, which is unintentionally transformed as well to "at_timezone"("from_unixtime" .
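The double rewrite described above can be illustrated with a toy model. This is not Coral code: expressions are plain nested tuples, the helper names (type_of, rewrite_from_unixtime) are hypothetical, only the return types stated in this issue are modelled, and the "$canonicalize_hive_timezone_id" wrapper is left out for brevity.

```python
# Toy model, not Coral code: an expression is a tuple (function_name, *args).
TSTZ = "timestamp(3) with time zone"
VARCHAR = "varchar"


def type_of(expr):
    """Infer the result type for the few functions discussed in this issue."""
    if not isinstance(expr, tuple):
        # Literals: strings are varchar, everything else is treated as double.
        return VARCHAR if isinstance(expr, str) else "double"
    name, *args = expr
    if name == "from_unixtime":
        return TSTZ
    if name == "format_datetime":
        return VARCHAR
    if name == "at_timezone":
        first, second = type_of(args[0]), type_of(args[1])
        if first != TSTZ:
            raise TypeError(
                f"Unexpected parameters ({first}, {second}) for function at_timezone"
            )
        return TSTZ
    raise ValueError(f"unknown function {name}")


def rewrite_from_unixtime(expr):
    """Wrap every from_unixtime call in format_datetime, nested calls included."""
    if not isinstance(expr, tuple):
        return expr
    name, *args = expr
    rewritten = (name, *[rewrite_from_unixtime(a) for a in args])
    if name == "from_unixtime":
        return ("format_datetime", rewritten, "yyyy-MM-dd HH:mm:ss")
    return rewritten


# Shape of the expression generated for from_utc_timestamp on a float input.
generated = ("at_timezone", ("from_unixtime", 2592000.0), "America/Los_Angeles")
print(type_of(generated))  # timestamp(3) with time zone

# Applying the from_unixtime rewrite to the generated expression breaks it.
broken = rewrite_from_unixtime(generated)
try:
    type_of(broken)
except TypeError as err:
    print(err)  # Unexpected parameters (varchar, varchar) for function at_timezone
```

In this model the generated expression type-checks on its own, and it is only the second pass over the nested from_unixtime call that produces the (varchar, varchar) mismatch reported by Trino.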