L13/vscode-diff

Feature Request: Add option to ignore Byte Order Mark

Closed this issue · 5 comments

Hello,

first off thank you for the wonderful VSCode plugin!

One feature I'm missing is an option to ignore byte order marks (BOMs) in UTF-8/UTF-16/UTF-32 encoded files. Currently, when comparing two files with the same content and encoding, one saved with a BOM and the other without, the plugin correctly reports them as modified.

In my use case, it would be helpful if, with an option to detect and ignore BOMs enabled, the plugin showed such files as unmodified.

What do you think?

L13 commented

I have been thinking about this idea over the past few days. Removing the UTF-8 BOM is an easy task because the first 128 code points are encoded the same as in ASCII, but UTF-16 is not as easy: removing leading/trailing whitespace and normalizing line endings can turn into a guessing game with this feature. I need some time to figure out a comfortable solution. Neither the extension nor VS Code supports UTF-32.
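For readers following along, here is a minimal sketch of what BOM-insensitive comparison could look like at the byte level. It is not the extension's actual code: the names `stripBom` and `equalIgnoringBom` are hypothetical, and the sketch assumes the file contents are already available as `Uint8Array` values. Only the standard BOM byte sequences for UTF-8 (`EF BB BF`), UTF-16BE (`FE FF`) and UTF-16LE (`FF FE`) are handled; UTF-32 is left out, in line with the comment above.

```typescript
// Hypothetical helpers for illustration; not taken from the extension's source.
const UTF8_BOM: number[] = [0xef, 0xbb, 0xbf];
const UTF16BE_BOM: number[] = [0xfe, 0xff];
const UTF16LE_BOM: number[] = [0xff, 0xfe];

// True if `bytes` begins with every byte of `prefix`, in order.
function startsWith(bytes: Uint8Array, prefix: number[]): boolean {
  if (bytes.length < prefix.length) return false;
  return prefix.every((value, index) => bytes[index] === value);
}

// Returns a view without the leading BOM, or the input itself if no BOM is found.
function stripBom(bytes: Uint8Array): Uint8Array {
  for (const bom of [UTF8_BOM, UTF16BE_BOM, UTF16LE_BOM]) {
    if (startsWith(bytes, bom)) return bytes.subarray(bom.length);
  }
  return bytes;
}

// Compares two byte sequences after dropping any leading BOM from each.
function equalIgnoringBom(a: Uint8Array, b: Uint8Array): boolean {
  const x = stripBom(a);
  const y = stripBom(b);
  return x.length === y.length && x.every((value, index) => value === y[index]);
}
```

Comparing the bytes of "hi" saved as UTF-8 with a BOM (`EF BB BF 68 69`) against the same text without one (`68 69`) would then report the two as equal. This sketch only removes the BOM; it does not address the whitespace and line-ending normalization that makes UTF-16 harder.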

The extension is a spare-time project. New features require some time, including 1 to 2 weeks of testing, so there is no timetable or anything like that.

Hi @L13, alright, I could always update the files to have the same encoding with or without BOMs and then use your plugin to compare the contents. That might be tedious for a large number of files, but that could be automated via scripts, too.

If you think it's not feasible, you may of course close this issue.
Thank you anyway for the great plugin.

L13 commented

Hi, the info was just to inform you that new features require some time.

I have already written a prototype for UTF-8, including unit tests, and it works. The UTF-16 task is a bit more challenging.

I cannot say when it will be done, but it is in progress because it is a valid request.

L13 commented

The new version is out now. I added a new option to ignore BOMs for UTF-8 and UTF-16BE files. This feature is active by default, but can also be disabled. The info in the list view and the stats in the output have been improved, too. It should now be easier to figure out the differences.

Just tried the new version and it's working as expected/requested.
I disabled the automatic detection, compared again and the files get picked up as having changes.

Thank you L13!