/fathom-comics

Comic extraction using Fathom

Primary language: JavaScript

The workflow is:

  1. Gather Comics to pull down the webarchives
  2. mv_random to divide the corpus
  3. enfolder.py to create a deeper folder hierarchy
  4. extract.js to extract rendered rects and HTML
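
As an illustration of step 2, a random corpus split could look roughly like the sketch below. This is not the repository's mv_random script: the function names `mulberry32` and `splitCorpus`, the `seed` parameter, and the example file names are all assumptions made for the example. It only shows the general idea of shuffling a list of archive names and cutting off a fixed number of them for a separate set.

```javascript
// Hypothetical sketch, not the actual mv_random script from this repository.

// Small seeded PRNG (mulberry32) so that a split can be reproduced.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // value in [0, 1)
  };
}

// Shuffle a copy of `names` (Fisher-Yates), then cut off the first `count`.
// The input array is left unmodified.
function splitCorpus(names, count, seed = 1) {
  const rand = mulberry32(seed);
  const shuffled = names.slice();
  for (let i = shuffled.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    [shuffled[i], shuffled[j]] = [shuffled[j], shuffled[i]];
  }
  return { picked: shuffled.slice(0, count), rest: shuffled.slice(count) };
}

// Example: set aside 2 of 5 (made-up) archive names.
const corpus = ['a.webarchive', 'b.webarchive', 'c.webarchive',
                'd.webarchive', 'e.webarchive'];
const split = splitCorpus(corpus, 2, 42);
console.log(split.picked, split.rest);
```

A real script would then move the picked files into their own directory. The seed is there only so that the same division can be recreated later.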