py_parse

Using BeautifulSoup to extract and tidy files ready for analysis.

Todo:
- Bring code from webcrawler to create a single project
- Create a CLI to run the various components
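
As a rough illustration of the "extract and tidy" step, here is a minimal sketch using BeautifulSoup with Python's built-in `html.parser`. The function name `extract_text`, the choice to drop `script` and `style` tags, and the whitespace rules are assumptions made for this example; they are not taken from the project's code.

```python
from bs4 import BeautifulSoup


def extract_text(html: str) -> str:
    """Return the readable text of an HTML string, one block per line.

    Hypothetical helper for illustration; not part of py_parse itself.
    """
    soup = BeautifulSoup(html, "html.parser")

    # Assumption: script and style contents are noise for analysis, so drop them.
    for tag in soup(["script", "style"]):
        tag.decompose()

    # A newline separator keeps neighbouring elements from running together.
    raw = soup.get_text(separator="\n")

    # Tidy: collapse runs of whitespace inside each line and skip empty lines.
    lines = (" ".join(line.split()) for line in raw.splitlines())
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    sample = "<body><h1> Title </h1><p>Some   text</p><script>var x = 1;</script></body>"
    print(extract_text(sample))
```

The output is plain text with one element per line, which may be a convenient starting point for later analysis. A real pipeline would likely need rules specific to the files being parsed.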