linkedin/coral

Questions about implicit conversion when hiveSQL is converted to trinoSQL

maidangdang44 opened this issue · 1 comments

I created a Hive table as follow, and insert two values:

create table test_db.coral_test
(
    `id` int,
    `name` string
) STORED AS TEXTFILE;

insert into test_db.coral_test values(1, "1");
insert into test_db.coral_test values(2, "name2");

It can be successful when running select count(*) from test_db.coral_test where name = 1 with Hive

I use coral to convert this HiveSQL to TrinoSQL and get SELECT COUNT(*) FROM "test_db"."coral_test" AS "coral_test" WHERE CAST("coral_test"."name" AS INTEGER) = 1

But when I use this TrinoSQL to run, I get the following error

trino> SELECT COUNT(*) FROM "test_db"."coral_test" AS "coral_test" WHERE CAST("coral_test"."name" AS INTEGER) = 1;

Query 20230421_092946_00031_gbeeb, FAILED, 2 nodes
Splits: 11 total, 1 done (9.09%)
0.23 [1 rows, 4B] [4 rows/s, 17B/s]

Query 20230421_092946_00031_gbeeb failed: Cannot cast 'name2' to INT

How can I use coral to get TrinoSQL running correctly?

here is my java code:

        HiveConf hiveConf = new HiveConf();
        hiveConf.set("hive.metastore.uris", "thrift://ip:port");
        HiveMetaStoreClient hiveMetaStoreClient = new HiveMetaStoreClient(hiveConf);
        HiveMscAdapter hiveMscAdapte = new HiveMscAdapter(hiveMetaStoreClient);
        HiveToRelConverter hiveToRealConverter = new HiveToRelConverter(hiveMscAdapte);
        
        String hiveSQL = "select count(*) from test_db.coral_test where name = 1";
        RelNode relNode = hiveToRealConverter.convertSql(hiveSQL);
        RelToTrinoConverter relToTrinoConverter = new RelToTrinoConverter();
        String trinoSQL = relToTrinoConverter.convert(relNode);
        <dependency>
            <groupId>com.linkedin.coral</groupId>
            <artifactId>coral-trino</artifactId>
            <version>2.0.153</version>
        </dependency>

the ideal translation in this case should be:
SELECT * FROM tmpissue WHERE TRY_CAST(name AS INTEGER) = 1 AND TRY_CAST(name AS INTEGER) IS NOT NULL

Coral should generate the try_cast() operator instead & introduce a null check, instead of a CAST operator. I'm going to tag this issue as a CoralIR issue.

cc: @wmoustafa