Character `%2F` is automatically converted to `/` in URL param
severo opened this issue · 1 comments
severo commented
When passing a parquet URL that contains the character `%2F`, it seems that the character is treated as `/`, which converts the original URL into a different one.
See, for example, the file: https://huggingface.co/datasets/squad/resolve/refs%2Fconvert%2Fparquet/plain_text/squad-train.parquet.
The app gives the following error:
Error
Traceback (most recent call last):
  File "/lib/python311.zip/_pyodide/_base.py", line 540, in eval_code_async
    await CodeRunner(
  File "/lib/python311.zip/_pyodide/_base.py", line 365, in run_async
    await coroutine
  File "<exec>", line 110, in <module>
  File "/lib/python311.zip/pyodide/http.py", line 201, in bytes
    self._raise_if_failed()
  File "/lib/python311.zip/pyodide/http.py", line 125, in _raise_if_failed
    raise OSError(
OSError: Request for https://huggingface.co/datasets/squad/resolve/refs/convert/parquet/plain_text/squad-train.parquet failed with status 404: Not Found
in these two cases:
- original: https://lite.datasette.io/?parquet=https://huggingface.co/datasets/squad/resolve/refs%2Fconvert%2Fparquet/plain_text/squad-train.parquet
- replacing `%2F` with `/` (does not exist): https://lite.datasette.io/?parquet=https://huggingface.co/datasets/squad/resolve/refs/convert/parquet/plain_text/squad-train.parquet
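The behaviour is consistent with the `parquet` query parameter being percent-decoded once before the file is fetched. Here is a minimal sketch of that suspected mechanism, using Python's standard `urllib.parse` rather than the app's actual code (which I have not checked). The helper name `parquet_param` is hypothetical, and whether the doubly-encoded form works in the app itself is an untested assumption.

```python
from urllib.parse import parse_qs, quote, urlsplit

ORIGINAL = (
    "https://huggingface.co/datasets/squad/resolve/"
    "refs%2Fconvert%2Fparquet/plain_text/squad-train.parquet"
)


def parquet_param(page_url: str) -> str:
    """Hypothetical helper: read `parquet` from the query string.

    parse_qs applies one round of percent-decoding to each value,
    which is the standard behaviour for query-string parsers.
    """
    query = urlsplit(page_url).query
    return parse_qs(query)["parquet"][0]


# Case 1: the parquet URL is pasted as-is.
# Decoding turns each %2F into "/", so the fetched URL differs from the original.
naive = parquet_param("https://lite.datasette.io/?parquet=" + ORIGINAL)

# Case 2: the whole value is percent-encoded first.
# "%" becomes "%25", so one round of decoding restores the original URL.
encoded = parquet_param(
    "https://lite.datasette.io/?parquet=" + quote(ORIGINAL, safe="")
)

print(naive)
print(encoded)
```

In this sketch, `naive` is the URL from the 404 message above, while `encoded` still contains `%2F`.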
cc @julien-c
severo commented
Note, in case somebody looks at this issue to load a HuggingFace dataset with lite.datasette.io, we now provide a simpler API to access the parquet files:
It does not contain `%2F` this time 😄