DDL extraction issue with control characters
Closed this issue · 2 comments
arjun-hareendran commented
Hello
I just found an inconsistency while extracting a DDL that makes use of control characters.
Below is the DDL that was used to create the table:
CREATE EXTERNAL TABLE db.tab
(
h_ls_hash array<string>,
ls_id STRING,
bin array<STRING>,
Class1 array<int>,
Class1_valueString array<string>,
Class1_valueFrom array<float>,
Class1_valueTo array<float>,
Class2 array<int>,
Class2_valueString array<string>,
Class2_valueFrom array<float>,
Class2_valueTo array<float>,
Class3 array<int>,
Class3_valueString array<string>,
Class3_valueFrom array<float>,
Class3_valueTo array<float>,
load_ts timestamp COMMENT 'EN: load timestamp | DE: Zeitstempel für das Laden des Datensatzes',
record_source string COMMENT 'EN: Source Name | DE: Quellenname'
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\u0001'
STORED AS TEXTFILE
location 'dbfs:/loc';
But when the DDLs were extracted using the migration tool, the result was the following:
<div class="ansiout">CREATE EXTERNAL TABLE `db`.`tab`(`h_ls_hash` ARRAY<STRING>, `ls_id` STRING, `bin` ARRAY<STRING>, `Class1` ARRAY<INT>, `Class1_valueString` ARRAY<STRING>, `Class1_valueFrom` ARRAY<FLOAT>, `Class1_valueTo` ARRAY<FLOAT>, `Class2` ARRAY<INT>, `Class2_valueString` ARRAY<STRING>, `Class2_valueFrom` ARRAY<FLOAT>, `Class2_valueTo` ARRAY<FLOAT>, `Class3` ARRAY<INT>, `Class3_valueString` ARRAY<STRING>, `Class3_valueFrom` ARRAY<FLOAT>, `Class3_valueTo` ARRAY<FLOAT>, `load_ts` TIMESTAMP COMMENT 'EN: load timestamp | DE: Zeitstempel für das Laden des Datensatzes', `record_source` STRING COMMENT 'EN: Source Name | DE: Quellenname')
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
'field.delim' = '�',
'serialization.format' = '�'
)
STORED AS
INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 'dbfs:/loc'
TBLPROPERTIES (
'transient_lastDdlTime' = '1606221712'
)
</div>
Can someone help me with this?
mrchristine commented
@arjun-hareendran can you try the new option, --metastore-unicode,
to verify that your issue is resolved?
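One way to check an extracted DDL for this problem is sketched below in Python. It is only an illustration and does not show how --metastore-unicode works internally. The helper names `find_suspect_lines` and `escape_control_chars` are hypothetical, and the sketch assumes the exported DDL has already been read into a string. It flags lines that contain a raw control character (such as the \u0001 delimiter) or the Unicode replacement character '�', and it can rewrite raw control characters as \uXXXX escapes. A delimiter that has already been replaced by '�' cannot be recovered this way.

```python
import re

# Raw C0 control characters other than tab, newline and carriage return, plus DEL.
_CONTROL = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")
# U+FFFD, the replacement character that appears when the original byte was lost.
_REPLACEMENT = "\ufffd"


def find_suspect_lines(ddl):
    """Return the lines of a DDL string that hold a raw control or replacement character."""
    # Split on "\n" only: str.splitlines() would also split on some control characters.
    return [
        line
        for line in ddl.split("\n")
        if _CONTROL.search(line) or _REPLACEMENT in line
    ]


def escape_control_chars(ddl):
    """Rewrite raw control characters as \\uXXXX escapes; U+FFFD is left untouched."""
    return _CONTROL.sub(lambda m: "\\u%04x" % ord(m.group()), ddl)


if __name__ == "__main__":
    # Hypothetical sample: SERDEPROPERTIES lines with a raw \x01 delimiter.
    sample = "'field.delim' = '\x01',\n'serialization.format' = '\x01'\nLOCATION 'dbfs:/loc'"
    print(len(find_suspect_lines(sample)), "suspect line(s) before escaping")
    print(escape_control_chars(sample))
```

If `find_suspect_lines` returns an empty list for a DDL exported with the new option, the delimiter was written in an escaped form rather than as a raw character.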
mrchristine commented
@arjun-hareendran were you able to give the new flag a try?