gr.File should add .html extension to URLs if possible
Describe the bug
When using gr.File, an uploaded file keeps its extension. However, when using a URL like https://liquipedia.net/starcraft2/Adept, the file saved in the cache has no extension. This is technically correct. However, since it's HTML being downloaded, it should really be saved with the .html extension (equivalent to right-click and Save As in the browser). This makes it easier for a user to identify where the file came from and also makes file type detection easier (avoiding the need for things like libmagic). Maybe it could just be a "default_web_extension" option that the user sets in the control.
The data source is lost when the upload happens (probably for security reasons?), so once a file has been downloaded to the cache, you can't tell what it was originally or where it came from. The workaround would be to assume that no extension means HTML, but that's a little error-prone too.
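For anyone wanting something less error-prone than "no extension means HTML", here is a minimal sketch of sniffing the cached file's leading bytes with only the standard library. The helper name `guess_extension_from_content` and the `sniff_bytes` parameter are hypothetical and not part of Gradio, and the checks only cover HTML and JSON, so treat it as a rough heuristic rather than a replacement for libmagic.

```python
import json
from pathlib import Path
from typing import Optional


def guess_extension_from_content(path: str, sniff_bytes: int = 2048) -> Optional[str]:
    """Hypothetical helper: guess '.html' or '.json' from a file's content.

    Returns None when the content matches neither heuristic.
    """
    data = Path(path).read_bytes()
    # Decode only the head of the file; drop a BOM and leading whitespace.
    head = data[:sniff_bytes].decode("utf-8", errors="ignore")
    head = head.lstrip("\ufeff \t\r\n").lower()

    # HTML documents normally start with a doctype or an <html> tag.
    if head.startswith("<!doctype html") or head.startswith("<html"):
        return ".html"

    # JSON documents start with an object or array; confirm by parsing.
    if head.startswith(("{", "[")):
        try:
            json.loads(data.decode("utf-8"))
            return ".json"
        except (UnicodeDecodeError, json.JSONDecodeError):
            return None

    return None
```

In a callback, this could be applied to the cached path that gr.File passes in, renaming or copying the file only when the helper returns an extension.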
Have you searched existing issues? 🔎
- I have searched and found no existing issues
Reproduction
Just use gr.File to upload a URL by pasting a URL into the File Explorer pop-up that appears when clicking Upload.
Screenshot
No response
Logs
No response
System Info
gradio==5.4.0
Severity
I can work around it
We cannot assume that a URL without an extension is an HTML file; for example, https://api.github.com/users/octocat returns JSON. A default_web_extension option seems too fine-grained, imo, to add to the component. Open to other suggestions, but another option would be to create a custom component for your use case! https://www.gradio.app/guides/five-minute-guide, and we're happy to help.
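To illustrate why a single default extension is fragile, here is a rough sketch of picking the extension from a response's Content-Type header instead, using Python's standard `mimetypes` module. The function name `extension_for_content_type` is hypothetical, and this assumes the header value is available at download time, which may not match how Gradio handles URLs internally.

```python
import mimetypes
from typing import Optional


def extension_for_content_type(content_type: Optional[str]) -> Optional[str]:
    """Hypothetical helper: map a Content-Type header value to an extension."""
    if not content_type:
        return None
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    mime = content_type.split(";", 1)[0].strip().lower()
    # Returns None when the MIME type is unknown.
    return mimetypes.guess_extension(mime)


# A JSON API response and an HTML page would get different extensions.
print(extension_for_content_type("application/json; charset=utf-8"))
print(extension_for_content_type("text/html; charset=UTF-8"))
```

With this approach the two example URLs in this thread would be distinguished by what the server reports, rather than by a user-chosen default.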