Script quits when \n is in a filename
dd388 opened this issue · 3 comments
Most linebreaks in filenames on the test disk images have been \r, but I have spotted one filename that ends in \n. Since the parser splits on \n when pulling in output from hls, this produces a line that doesn't match either regular expression, and the script fails.
This hopefully is a really weird edge case, but I should account for it.
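A minimal sketch of the failure, for illustration only. The `FILE_LINE` pattern and the sample lines are hypothetical simplifications, not the script's real regexes or actual hls output; the only thing carried over from the report is that filenames are double-quoted and the output is split on \n.

```python
import re

# Hypothetical, simplified stand-in for the script's file-line regex:
# a line starting with "f" that ends in a double-quoted filename.
FILE_LINE = re.compile(r'^f .* "(?P<name>[^"]*)"$')


def unmatched_lines(raw):
    """Return the non-empty lines that the pattern fails to match."""
    return [line for line in raw.split('\n')
            if line and not FILE_LINE.match(line)]


# Two made-up entries; the second filename ends in "\n".
raw = 'f 10 "ok"\nf 20 "bad\n"\n'

# Splitting on "\n" cuts the second entry in two: the first half has no
# closing quote, and the second half is a lone quote character.
print(unmatched_lines(raw))
```

Under these assumptions, the second entry comes back as two fragments, neither of which the pattern accepts.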
Possible solution--
Add two more regexes to quickly identify the start of a file line or a directory line. If a file line does not end in a double quote (and/or if the next line consists of just double quotes), assume that the line split fell inside the quoted filename, before its closing quote. Add a double quote to the end of the string and re-parse it, then skip the next line that consists of one pair of double quotes. Do the same for directories.
This is, of course, the best I can factor at the end of the day on a Friday. Maybe I'll have a better solution later.
If a line does not end in a quote AND the next line is a quote, we can assume that there is a \n in that filename. I think.
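A rough sketch of that check, assuming the leftover line is exactly one double-quote character (an assumption; the real hls output may differ). The function name `rejoin_split_lines` and the sample lines are hypothetical. Instead of appending a quote and skipping the next line, this version glues the two fragments back together with the \n restored, so the filename keeps its trailing linebreak.

```python
def rejoin_split_lines(lines):
    """Merge a line lacking a closing quote with the lone quote after it.

    Assumes the fragment left over from the split is exactly '"'.
    """
    merged = []
    i = 0
    while i < len(lines):
        line = lines[i]
        nxt = lines[i + 1] if i + 1 < len(lines) else None
        if not line.endswith('"') and nxt == '"':
            # Restore the "\n" that the split removed, consume both lines.
            merged.append(line + '\n' + nxt)
            i += 2
        else:
            merged.append(line)
            i += 1
    return merged


# Made-up listing; the second filename ends in "\n".
lines = 'f 10 "ok"\nf 20 "bad\n"\nd 3 "dir"'.split('\n')
print(rejoin_split_lines(lines))
```

Any regex applied to the merged lines would then need to tolerate a \n inside the quotes, for example by compiling it with `re.DOTALL`.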
Cleaning up after the stray linebreak -- as implied by my suggestion to myself in May -- is not trivial.
Possible solution here: https://github.com/cul-it/hfs2dfxml/tree/betterparse
Needs testing.