decrypto-org/spider

Several Crawls in one DB

New table: Crawls, with an id and a human-readable id

Paths: the secure flag could change over time, so it could be moved to Contents. Each content entry would then record whether it was found with the secure flag set to true or false.

Contents: Add

  • Crawl ID foreign key
  • secure flag

Links: Add

  • Source Content
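
A minimal sketch of the schema changes listed above, using an in-memory SQLite database purely for illustration. All table and column names (`crawls`, `human_readable_id`, `crawl_id`, `secure`, `source_content_id`, `body`, `destination`) are assumptions and not taken from the existing schema; the real database engine and column types may differ.

```python
import sqlite3

# Hypothetical schema: names and types are illustrative assumptions.
SCHEMA = """
CREATE TABLE crawls (
    id INTEGER PRIMARY KEY,
    human_readable_id TEXT NOT NULL UNIQUE
);
CREATE TABLE contents (
    id INTEGER PRIMARY KEY,
    crawl_id INTEGER NOT NULL REFERENCES crawls(id),    -- new: crawl foreign key
    secure INTEGER NOT NULL CHECK (secure IN (0, 1)),   -- new: flag moved from paths
    body TEXT
);
CREATE TABLE links (
    id INTEGER PRIMARY KEY,
    source_content_id INTEGER NOT NULL REFERENCES contents(id),  -- new: source content
    destination TEXT NOT NULL
);
"""


def create_schema(conn):
    """Create the sketched tables and enable foreign key enforcement."""
    # SQLite only enforces foreign keys when this pragma is on for the connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)


conn = sqlite3.connect(":memory:")
create_schema(conn)
```

With this layout, the same path can appear in several crawls, and each content row says which crawl found it and whether it was found securely.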

Whenever the crawler is run:
Configuration: tag and random ID => find the control ID and use it
Paths: we consider all paths that finished before the start of the current crawl
Assumption: we only ever run one crawl at a time
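
The start-up steps could look roughly like the sketch below. It is one possible reading of the notes, not the implementation: the helper names `start_crawl` and `paths_to_consider`, the `started_at` and `finished_at` columns, the use of integer Unix timestamps, and the way the tag and a random id are joined into the human-readable id are all assumptions. It relies on the stated assumption that only one crawl runs at a time.

```python
import sqlite3
import uuid


def start_crawl(conn, tag, started_at):
    """Register a new crawl and return its numeric id (hypothetical helper)."""
    # Human-readable id built from the configured tag plus a random suffix.
    human_id = f"{tag}-{uuid.uuid4().hex[:8]}"
    cur = conn.execute(
        "INSERT INTO crawls (human_readable_id, started_at) VALUES (?, ?)",
        (human_id, started_at),
    )
    return cur.lastrowid


def paths_to_consider(conn, started_at):
    """Return paths that finished strictly before the current crawl started."""
    rows = conn.execute(
        "SELECT path FROM paths "
        "WHERE finished_at IS NOT NULL AND finished_at < ? "
        "ORDER BY path",
        (started_at,),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE crawls (
    id INTEGER PRIMARY KEY,
    human_readable_id TEXT NOT NULL UNIQUE,
    started_at INTEGER NOT NULL
);
CREATE TABLE paths (
    id INTEGER PRIMARY KEY,
    path TEXT NOT NULL,
    finished_at INTEGER  -- NULL while the path is still unfinished
);
""")
```

Because crawls never overlap under the stated assumption, comparing `finished_at` against the current start time is enough to separate earlier results from the running crawl.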