Add support to parse files on local file system.
jeremyje opened this issue · 1 comment
jeremyje commented
Is your feature request related to a problem? Please describe.
It'd be nice if Ferret could parse HTML files that are on the file system.
Describe the solution you'd like
Enhance the HTTP driver to accept file:// URLs. Most of the HTTP driver's mechanics work just as well for a local file. The cookies can be stubbed to nil since there's no HTTP context.
Describe alternatives you've considered
It's possible to run an HTTP server in-process, but for large-scale data processing this becomes inefficient.
I've written #777 and it appears to work within my code (not public yet).
ziflex commented
Hey,
Have you tried using the following query?
LET bin = IO::FS::READ(@filepath)
LET doc = PARSE(TO_STRING(bin))
RETURN doc.innerHtml