archesproject/arches

Django default_storage breaks on non-ascii characters in files

Closed this issue · 4 comments

When importing a CSV, Django's default_storage object is often used to handle the file, like so:

with default_storage.open(csv_file_path, mode="r") as csvfile:
    reader = csv.reader(csvfile)
    data = {"csv": [line for line in reader], "csv_file": csv_file_name}

However, the default encoding used is ascii, which raises an error on non-ASCII characters. Django's default_storage.open method does not take an encoding= kwarg like Python's built-in open does, so there is no good way to force utf-8 encoding when opening files this way.
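One possible workaround, sketched here rather than taken from the Arches codebase, is to open the file in binary mode and decode the bytes explicitly. The helper name read_csv_rows is hypothetical, and io.BytesIO stands in for the object that default_storage.open(csv_file_path, mode="rb") would return; the sketch only assumes that object has a read() method returning bytes.

```python
import csv
import io


def read_csv_rows(binary_file, encoding="utf-8"):
    """Read a binary file-like object, decode it explicitly, and parse it as CSV.

    Hypothetical helper: it does not depend on the platform's default encoding.
    """
    text = binary_file.read().decode(encoding)
    # newline="" leaves line endings untouched so the csv module can handle them.
    return list(csv.reader(io.StringIO(text, newline="")))


# Stand-in for: with default_storage.open(csv_file_path, mode="rb") as csvfile:
sample = io.BytesIO("name,city\nZoë,Kraków\n".encode("utf-8"))
rows = read_csv_rows(sample)
print(rows)
```

This reads the whole file into memory, which matches what the list comprehension above already does, so it may not suit very large uploads.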

@whatisgalen you might consider posting on the Django forum to gauge appetite for doing the same thing for the storage interface that was done for the file interface in 5.0 to allow all kwargs through to open() (assuming it is still a problem in 5.1)

Just posted!

However the default encoding used is ascii

Just noting that this is platform dependent. Were you testing with Windows?
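For anyone wanting to check their own environment: when no encoding is passed, Python's text-mode open() falls back to the locale's preferred encoding, which can be queried directly. This is a minimal diagnostic sketch, assuming the storage backend ends up calling the built-in open() without an encoding argument. That appears to be true for the local filesystem backend, but it is an assumption for any other backend.

```python
import locale
import sys

# Encoding that open() uses in text mode when no encoding= is given.
default_text_encoding = locale.getpreferredencoding(False)
print("default text encoding:", default_text_encoding)

# 1 if Python's UTF-8 mode is enabled (e.g. via -X utf8 or PYTHONUTF8=1), else 0.
print("utf8 mode:", sys.flags.utf8_mode)
```

Running this under the same user and environment variables as the web server or worker process matters, since the locale seen by a service can differ from the one in an interactive shell.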

Ubuntu 20 actually