frictionlessdata/tabulator-py

Why is tsv.FinalBackslashInFieldIsForbidden raised?

SunYanCN opened this issue · 6 comments

I have a TSV file named '1.tsv' with this content:
1 2\ 1

That is, 1TAB2\TAB1 (the second field ends with a backslash).

When I run this code:

from tabulator import Stream

if __name__ == '__main__':
    input_file = '1.tsv'
    with Stream(input_file) as stream:
        for row in stream:
            print(row)

I get an error:

Traceback (most recent call last):
  File "/home/band/band/base.py", line 168, in <module>
    with Stream(input_file) as stream:
  File "/root/anaconda3/envs/tf20/lib/python3.6/site-packages/tabulator/stream.py", line 163, in __enter__
    self.open()
  File "/root/anaconda3/envs/tf20/lib/python3.6/site-packages/tabulator/stream.py", line 275, in open
    self.__extract_sample()
  File "/root/anaconda3/envs/tf20/lib/python3.6/site-packages/tabulator/stream.py", line 474, in __extract_sample
    row_number, headers, row = next(self.__parser.extended_rows)
  File "/root/anaconda3/envs/tf20/lib/python3.6/site-packages/tabulator/parsers/tsv.py", line 64, in __iter_extended_rows
    for row_number, item in enumerate(items, start=1):
  File "/root/anaconda3/envs/tf20/lib/python3.6/site-packages/tsv.py", line 44, in un
    columns = next(rows, None)
  File "/root/anaconda3/envs/tf20/lib/python3.6/site-packages/tsv.py", line 66, in parse_lines
    values = parse_line(line)
  File "/root/anaconda3/envs/tf20/lib/python3.6/site-packages/tsv.py", line 73, in parse_line
    return [parse_field(s) for s in line.split('\t')]
  File "/root/anaconda3/envs/tf20/lib/python3.6/site-packages/tsv.py", line 73, in <listcomp>
    return [parse_field(s) for s in line.split('\t')]
  File "/root/anaconda3/envs/tf20/lib/python3.6/site-packages/tsv.py", line 98, in parse_field
    raise FinalBackslashInFieldIsForbidden
tsv.FinalBackslashInFieldIsForbidden

The relevant function in tsv.py:

def parse_field(s):
    o = ''
    if s == '\\N':
        return None
    before, sep, after = s.partition('\\')
    while sep != '':
        o += before
        if after == '':
            raise FinalBackslashInFieldIsForbidden
        if after[0] in escapes:
            o += escapes[after[0]]
            before, sep, after = after[1:].partition('\\')
        else:
            before, sep, after = after.partition('\\')
    else:
        o += before
        return o

This function seems to be what raises the error.
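
To make the failure easier to reproduce without tabulator installed, here is a minimal standalone sketch built around the parse_field function quoted above. The escapes table and the exception's base class are assumptions (neither is shown in this thread); only the control flow is copied from the quoted code.

```python
# Assumption: the real `escapes` mapping in tsv.py is not shown in this thread.
# This table is a guess covering the usual backslash escapes.
escapes = {'t': '\t', 'n': '\n', 'r': '\r', '\\': '\\'}


class FinalBackslashInFieldIsForbidden(ValueError):
    """Stand-in for the exception raised by tsv.py (base class is assumed)."""


def parse_field(s):
    # Same control flow as the function quoted in this issue.
    o = ''
    if s == '\\N':
        return None
    before, sep, after = s.partition('\\')
    while sep != '':
        o += before
        if after == '':
            # A backslash with nothing after it is an incomplete escape.
            raise FinalBackslashInFieldIsForbidden
        if after[0] in escapes:
            o += escapes[after[0]]
            before, sep, after = after[1:].partition('\\')
        else:
            before, sep, after = after.partition('\\')
    o += before
    return o


def parse_line(line):
    return [parse_field(s) for s in line.split('\t')]


if __name__ == '__main__':
    try:
        parse_line('1\t2\\\t1')  # the row from 1.tsv: 1 TAB 2\ TAB 1
    except FinalBackslashInFieldIsForbidden:
        print('field "2\\" ends with a lone backslash, so parsing fails')
    # Doubling the backslash in the data makes the same row parse.
    print(parse_line('1\t2\\\\\t1'))
```

Under these assumptions, the field 2\ is split into before='2' and after='', which hits the raise. The parser treats a backslash as the start of an escape sequence, so a trailing backslash reads as an unfinished escape rather than a literal character.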

roll commented

@SunYanCN
Thanks, this seems to be a bug.

Are there any new developments on this issue? @roll

roll commented

@SunYanCN
Not yet

BTW, I don't really understand the reason for this exception. Why is a final backslash in a field forbidden (FinalBackslashInFieldIsForbidden)? I would open a question on the tsv issue tracker first (and I don't actually see how we could fix it at the tabulator level anyway).

Are you interested in leading this investigation?

stale commented

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

roll commented

Closing, as this needs to be filed against https://github.com/solidsnack/tsv if it's still relevant.
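
For anyone landing here with the same file, one possible workaround is to skip backslash-escape handling entirely and read the file as plain tab-delimited text. The sketch below uses only Python's standard csv module; read_raw_tsv is a hypothetical helper name, not part of tabulator or tsv, and this approach assumes the data does not rely on \t, \n or \N escapes.

```python
import csv
import io


def read_raw_tsv(text):
    """Split tab-delimited text into rows, treating backslashes as literal characters.

    Hypothetical helper for illustration; no escape sequences are interpreted.
    """
    reader = csv.reader(
        io.StringIO(text),
        delimiter='\t',
        quoting=csv.QUOTE_NONE,  # no quote handling; no escapechar is set either
    )
    return list(reader)


if __name__ == '__main__':
    # Same content as 1.tsv from the report: 1 TAB 2\ TAB 1
    for row in read_raw_tsv('1\t2\\\t1\n'):
        print(row)
```

The trade-off is that escaped tabs, newlines and nulls in the data are left as raw text, so this only fits files that were never written with those escapes in the first place.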