tipr/bagit

Filenames with carriage returns are improperly written to manifest

nkrabben opened this issue · 3 comments

Any carriage return in a filename (Icon^M is a common system file with this feature) is stripped when the filename is written to the manifest. According to the BagIt spec, such characters in filenames should be percent-encoded.

This was also a bug in bagit-python and bagit-java:
LibraryOfCongress/bagit-python#12
LibraryOfCongress/bagit-java#51

To reproduce, on the command line:

mkdir testbag
touch testbag/Icon^M
bagit baginplace testbag
bagit verifyvalid testbag

To create Icon^M on the command line, type Icon, then press Ctrl+V followed by Enter to insert a literal carriage return.
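
If typing the control character interactively is awkward, the test file can also be created from a script. This is only a sketch: the helper name `make_cr_file` is made up for illustration, and it assumes a POSIX-style filesystem that allows a carriage return in a filename.

```python
import os
import tempfile


def make_cr_file(bag_dir: str) -> str:
    """Create an empty file named 'Icon\\r' inside bag_dir and return its path.

    Hypothetical helper for reproducing the issue; assumes the filesystem
    accepts a carriage return in a filename (typical on Linux and macOS).
    """
    os.makedirs(bag_dir, exist_ok=True)
    path = os.path.join(bag_dir, "Icon\r")
    # Opening in write mode and closing immediately leaves an empty file,
    # which is what `touch` produces.
    with open(path, "w"):
        pass
    return path


if __name__ == "__main__":
    # Use a throwaway directory so the demo does not touch the working tree.
    demo_dir = os.path.join(tempfile.mkdtemp(), "testbag")
    created = make_cr_file(demo_dir)
    print(repr(os.path.basename(created)))
```

The resulting directory can then be passed to the bagit commands shown above.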

Bagger and bagit-python add the following line to the manifest:
d41d8cd98f00b204e9800998ecf8427e data/Icon%0D
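
For reference, here is a rough sketch of the encoding that produces a line like the one above. It is not the code from Bagger, bagit-python, or this gem; the function names are invented for illustration. It assumes the rule, as I understand it from the BagIt spec (RFC 8493), that CR, LF, and the percent sign itself are the characters to percent-encode in manifest paths.

```python
import hashlib


def encode_manifest_path(path: str) -> str:
    """Percent-encode CR, LF and '%' in a manifest file path (illustrative)."""
    # Encode '%' first so the escapes added afterwards are not re-encoded.
    return (
        path.replace("%", "%25")
        .replace("\r", "%0D")
        .replace("\n", "%0A")
    )


def manifest_line(digest: str, path: str) -> str:
    """Build one manifest entry: checksum, a space, then the encoded path."""
    return f"{digest} {encode_manifest_path(path)}"


if __name__ == "__main__":
    # The test file is empty, so its MD5 is the digest of zero bytes.
    empty_md5 = hashlib.md5(b"").hexdigest()
    print(manifest_line(empty_md5, "data/Icon\r"))
```

A validator would need the reverse step (decoding %0D back to a carriage return) before looking the file up on disk, otherwise verification fails even when the manifest is written correctly.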

@nkrabben I've been working on this application over here:
https://github.com/little9/bagit

I'm not sure if @tipr is still active.