gildas-lormeau/SingleFile

Have you reconsidered adding WARC support?

YousufSSyed opened this issue · 2 comments

I saw the issue for this posted all the way back in 2019, and I think it's a really good time to look at supporting the WARC format.

  • There's a lot more software that supports it now.
  • It can be viewed in the browser with sites like https://replayweb.page.
  • WARCs (both .warc and .warc.gz) can easily be concatenated, with unix cat in the command line for instance.
  • When combined, their resources can be deduplicated, allowing space to be saved.
  • They can be loaded into web replay software like Pywb.
  • There are currently no good add-ons for downloading pages as WARCs. Warcreate is only available on Chrome, not Firefox, and other software requires a lot of setup outside the browser.
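
To illustrate the concatenation point, here is a minimal Python sketch. It assumes each record is gzipped as its own member, which is the usual layout for .warc.gz files. The `make_record` helper and the example.com URLs are made up for illustration, and the records are simplified (a real record also carries fields such as WARC-Record-ID and WARC-Date).

```python
import gzip


def make_record(target_uri: str, payload: bytes) -> bytes:
    """Build a simplified WARC-style 'resource' record.

    Hypothetical helper for illustration only: a spec-compliant record
    also needs WARC-Record-ID and WARC-Date headers.
    """
    headers = (
        "WARC/1.0\r\n"
        "WARC-Type: resource\r\n"
        f"WARC-Target-URI: {target_uri}\r\n"
        f"Content-Length: {len(payload)}\r\n"
        "\r\n"
    ).encode("utf-8")
    # A record ends with two CRLFs after the payload.
    return headers + payload + b"\r\n\r\n"


rec_a = make_record("https://example.com/a", b"hello")
rec_b = make_record("https://example.com/b", b"world")

# Compress each record separately, then join the bytes.
# This is the in-memory equivalent of `cat a.warc.gz b.warc.gz`.
combined_gz = gzip.compress(rec_a) + gzip.compress(rec_b)

# A multi-member gzip stream decompresses to the concatenated records.
restored = gzip.decompress(combined_gz)
assert restored == rec_a + rec_b
print(restored.count(b"WARC/1.0"))
```

This works because a gzip file may contain several members back to back, so plain byte concatenation still yields a valid stream. Deduplication is a separate step that this sketch does not attempt.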

The fundamental problem is that SingleFile does not inspect network exchanges, whereas the WARC format was designed to do just that. That's why I haven't delved into the subject.

I'm moving this issue to the Discussions tab to keep it visible and to continue the discussion there.