AbsaOSS/cobrix

PIC 9(11)V9(02) fields full of spaces are converted to 0.00

Loganhex2021 opened this issue · 1 comments

Describe the bug

The hex value in the source file is 40404040404040404040404040, i.e. the field is filled entirely with EBCDIC spaces (0x40). But when the file is read, the spaces are converted into 0.00. We expect null here.
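To make the report concrete, here is a minimal Python sketch of what the bytes contain and of the behavior being asked for. It is not Cobrix code: `decode_unsigned_display` is a hypothetical helper written only for illustration, and it assumes an unsigned DISPLAY field in code page 037 with an implied decimal point.

```python
from decimal import Decimal
from typing import Optional

# The 13 raw bytes quoted in the report (PIC 9(11)V9(02) is 13 digits wide).
RAW = bytes.fromhex("40404040404040404040404040")


def decode_unsigned_display(raw: bytes, scale: int) -> Optional[Decimal]:
    """Hypothetical decoder for an unsigned DISPLAY numeric field (not Cobrix code)."""
    text = raw.decode("cp037")  # EBCDIC code page 037 -> str
    if text.strip() == "":
        # The field holds only blanks: treat it as "no value" rather than zero.
        return None
    if not all(ch in "0123456789" for ch in text):
        # Anything other than the digits 0-9 is treated as malformed here.
        return None
    # Apply the implied decimal point (the V in the PIC clause).
    return Decimal(int(text)).scaleb(-scale)


print(repr(RAW.decode("cp037")))          # thirteen space characters
print(decode_unsigned_display(RAW, 2))    # None
```

With this rule, a field of EBCDIC zeros (0xF0 repeated) still decodes to 0.00, while a field of spaces decodes to null, which is the distinction the report asks for.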

Code snippet that caused the issue

  spark.read
   .format("cobol")

Expected behavior

The field should be read as null when it contains only spaces.

Context

  • Cobrix version:
  • Spark version:
  • Scala version:
  • Operating system:

Copybook (if possible)

         01  RECORD.
           05  FIELD1    PIC 9(11)V9(02).

Read_options:
{'encoding': 'EBCDIC', 'ebcdic_code_page': 'cp037', 'string_trimming_policy': 'none', 'debug_ignore_file_size': 'true'}

@yruslan, could you please guide us in fixing this issue?

There is an option ('improved_null_detection') that improves null detection. Since it is a breaking change, it is turned off by default. But even when it is turned on, the issue you described still occurs. Will fix.

In 2.5.0 the option will be turned on by default.
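For versions where the option is still off by default, it can be passed like any other read option. The PySpark sketch below is an illustration only: the helper names, `data_path`, and `copybook` are hypothetical, it assumes an existing SparkSession with Cobrix on the classpath, and it assumes the copybook is supplied inline through the `copybook_contents` option. Per the reply above, enabling the option did not yet resolve the all-spaces case at the time of this issue.

```python
# Read options quoted in the report.
REPORTED_OPTIONS = {
    "encoding": "EBCDIC",
    "ebcdic_code_page": "cp037",
    "string_trimming_policy": "none",
    "debug_ignore_file_size": "true",
}


def with_improved_null_detection(options):
    """Return a copy of the options with improved_null_detection switched on."""
    merged = dict(options)
    merged["improved_null_detection"] = "true"
    return merged


def read_cobol(spark, data_path, copybook):
    """Hypothetical helper: `spark` is an existing SparkSession with Cobrix available."""
    return (
        spark.read.format("cobol")
        .option("copybook_contents", copybook)  # assumption: copybook passed inline
        .options(**with_improved_null_detection(REPORTED_OPTIONS))
        .load(data_path)
    )
```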