zkemail/zk-regex

Make an automated regex generator given just emails

Divide-By-0 opened this issue · 0 comments

Given two emails (raw .eml bodies), either two from the same template or one each from an old and a new version of a template, we should be able to auto-generate a regex by:

  • detecting the maximum overlap in the raw text and constraining those spans to match exactly
  • detecting the parts that differ and constraining each to the correct character type (e.g. a contiguous sequence with a single @ is constrained via an email address regex; mixes of decimal, hex, ASCII, floats, etc. are autodetected)
  • allowing the user to highlight which match groups they want to reveal (e.g. via zkregex.com/tool), and loosening the constraints on the rest of the unmatched text
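
To make the first two steps concrete, here is a rough sketch in Python, not a proposed implementation. It assumes whitespace-delimited tokens are a good enough unit for diffing, and that a small ordered list of candidate character classes covers the differing parts. The names `CANDIDATES`, `classify`, and `generate_regex` are made up for illustration, and the output is a Python `re` pattern that may still need translating into whatever syntax zk-regex accepts.

```python
import re
from difflib import SequenceMatcher

# Hypothetical candidate classes, tried in order from strictest to loosest.
CANDIDATES = [
    r"[0-9]+",                             # decimal
    r"[0-9]+\.[0-9]+",                     # float
    r"[0-9a-fA-F]+",                       # hex
    r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+",   # contiguous sequence with a single @
    r"[ -~]+",                             # printable ASCII
]
FALLBACK = r"[\s\S]*"  # used when no candidate fits, e.g. one side is empty


def tokenize(text):
    # Split into alternating runs of whitespace and non-whitespace.
    return re.findall(r"\s+|\S+", text)


def classify(samples):
    # Pick the first candidate that fully matches every differing sample.
    for pattern in CANDIDATES:
        if all(re.fullmatch(pattern, s) for s in samples):
            return pattern
    return FALLBACK


def generate_regex(email_a, email_b):
    a, b = tokenize(email_a), tokenize(email_b)
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    parts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Shared text is constrained to match literally.
            parts.append(re.escape("".join(a[i1:i2])))
        else:
            # Differing text becomes a capture group with an inferred class.
            samples = ["".join(a[i1:i2]), "".join(b[j1:j2])]
            parts.append("(" + classify(samples) + ")")
    return "".join(parts)


email_a = "From: alice@example.com\nAmount: 12.50 USD\nTx: deadbeef01\n"
email_b = "From: bob@mail.org\nAmount: 7.25 USD\nTx: 00ffab12cd\n"
pattern = generate_regex(email_a, email_b)
print(pattern)
print(re.fullmatch(pattern, email_a).groups())
```

With only two samples the inferred classes can be too narrow: a hex field whose two examples happen to contain only digits would be classified as decimal. A real tool would probably want more samples or a way for the user to override the guess, and the third step (choosing which groups to reveal) is not covered here.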

This will be critical for partners like zkp2p, both to support new emails and to adapt rapidly to template changes. Because this is a complete, integrated project, it carries a much larger bounty.