website-scraper/node-website-scraper

Scrape files with the exact same names when they include French characters

Hi,
I need the scraped files to have exactly the same names as the original files. Instead, when the original filenames include French characters, they get mangled. For example:
Capture-d’écran-2020-09-01-à-15.58.14.png
becomes
Capture-dâ__écran-2020-09-01-à-15.58.14.png
Is there a way to preserve the exact original filenames?

Hi @LucasDemea

Sounds like a bug. I need to check this.

I suggest customizing the generateFilename action as described here to get the correct filenames. Hope it helps.
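
A minimal sketch of such a customization, assuming the plugin interface from the website-scraper README (a class with `apply(registerAction)` and a `generateFilename` handler that returns `{ filename }`). The helper `filenameFromUrl` and the class name `PreserveFilenamePlugin` are hypothetical names made up for this example. It derives the filename from the resource URL and decodes percent-escapes so that non-ASCII characters are kept. Whether this avoids the root cause of the bug is not confirmed here.

```javascript
// Hypothetical helper (not part of website-scraper): build a relative
// filename from a resource URL, decoding percent-escapes per path segment.
function filenameFromUrl(resourceUrl) {
  const { pathname } = new URL(resourceUrl);

  const decodeSegment = (segment) => {
    try {
      // Replace any decoded slash or backslash so a segment cannot
      // turn into extra directories.
      return decodeURIComponent(segment).replace(/[\/\\]/g, '_');
    } catch {
      // Malformed escape sequence: keep the segment as it is.
      return segment;
    }
  };

  const segments = pathname.split('/').filter(Boolean).map(decodeSegment);

  // Directory-like URLs get a default file name (an assumption of this sketch).
  if (segments.length === 0 || pathname.endsWith('/')) {
    segments.push('index.html');
  }
  return segments.join('/');
}

// Hypothetical plugin: overrides how filenames are generated.
class PreserveFilenamePlugin {
  apply(registerAction) {
    registerAction('generateFilename', ({ resource }) => {
      return { filename: filenameFromUrl(resource.getUrl()) };
    });
  }
}
```

The plugin would then be passed in the `plugins` array of the scraper options, for example `plugins: [new PreserveFilenamePlugin()]`. Decoding each segment separately keeps an escaped `/` from creating unintended subdirectories.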

I can confirm that this is a bug. The root cause and possible workarounds are described in #454.

We'll think about how to fix it in future versions.

The issue should be fixed by #482. The fix will be released in the next version, 5.1.0.

Reopening because version 5.1.0, which contained the fix, was deprecated and reverted.