frictionlessdata/tabulator-py

Bug introduced to s3 paths with spaces?

Closed this issue · 4 comments

Overview

It seems like you made a commit (e97ec9f) that was supposed to fix an issue with s3 paths and spaces. Instead this seems to have introduced a bug that makes s3 paths with spaces NOT load properly. Maybe it's interacting with some custom code of mine that is preprocessing those load paths, but I am unable to find it. The error I'm getting is "Failed to find the file s3://path/to/file with spaces.csv in s3".

I can try to reproduce it with code when I get a chance, but for me the issue happens whenever an s3 path that has a space in it is loaded.



OK I'm just now noticing that you made this change in response to my own issue frictionlessdata/frictionless-py#501 :)

Sorry that it took me a while to get to this, but it seems like the fix you made for goodtables created issues in datapackage-pipelines/dataflows for s3 files with spaces.

roll commented

Thanks @cschloer,

I'll investigate. I think the fix was correct (Frictionless has a test for it - https://github.com/frictionlessdata/frictionless-py/blob/master/tests/plugins/test_aws.py#L86) but it might have been hacked in dataflows somehow.

BTW, is it for all paths with spaces, or e.g. only ones with Unicode characters?

Hey, it turns out this is on my end. I was calling requote_uri somewhere in my code to change the spaces to %20.
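
For anyone hitting something similar, here is a minimal sketch of how pre-quoting a path changes the key that ends up being looked up. It is only an illustration: `urllib.parse.quote` is used as a standard-library stand-in for the `requote_uri` call mentioned above, the `split_s3` helper is hypothetical, and the path is the placeholder from the error message, not a real bucket.

```python
from urllib.parse import quote, unquote, urlparse

# Placeholder path taken from the error message in this issue.
original = "s3://path/to/file with spaces.csv"

# Stand-in for the pre-quoting step (assumption: the real code called
# requote_uri; quote() is used here so the sketch needs only the stdlib).
quoted = quote(original, safe="/:")


def split_s3(url):
    """Hypothetical helper: split an s3:// URL into (bucket, key)."""
    parts = urlparse(url)
    return parts.netloc, parts.path.lstrip("/")


bucket, key = split_s3(original)
quoted_bucket, quoted_key = split_s3(quoted)

print(key)         # key containing literal spaces
print(quoted_key)  # key containing %20, a different string

# The two keys only match again once the quoting is undone.
print(unquote(quoted_key) == key)
```

If the loader treats the path as a literal key, the quoted and unquoted forms refer to two different objects, so quoting should happen in exactly one place, or not at all.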