fast-crawler

Download crawled datasets

Terminal 1: start the Celery worker

$ celery -A web_scraper_server.celery_app worker --concurrency 5 --loglevel=info
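
The worker above loads its task definitions from the web_scraper_server.celery_app module. Below is a minimal sketch of what such a module might contain, assuming a local Redis broker; the broker URLs, task name, and use of requests are illustrative assumptions, not this repository's actual configuration.

# web_scraper_server/celery_app.py (illustrative sketch, assumed layout)
from celery import Celery
import requests

celery_app = Celery(
    "web_scraper_server",
    broker="redis://localhost:6379/0",   # assumed broker URL
    backend="redis://localhost:6379/1",  # assumed result backend
)

@celery_app.task
def crawl_url(url: str) -> str:
    """Fetch a page and return its raw HTML (placeholder crawl task)."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return response.text

With a module along these lines in place, the worker started in Terminal 1 registers crawl_url and executes whatever crawl jobs the server enqueues.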

Terminal 2: start the scraper server

$ ./run_server.sh

Terminal 3: start the crawler

$ ./run.sh

WORK TODO

  • Implement a PDF reader in Python and extract text from PDFs (see the sketch after this list)
  • Crawl Investing.com
  • Crawl Statistics Korea (통계청)
  • Crawl CNBC News
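
For the PDF-reader item, here is a minimal sketch using the pypdf library; the library choice, function name, and file path are assumptions, not code from this repository.

# pdf_reader.py (illustrative sketch)
from pypdf import PdfReader

def extract_text(pdf_path: str) -> str:
    """Return the concatenated text of every page in the PDF."""
    reader = PdfReader(pdf_path)
    return "\n".join(page.extract_text() or "" for page in reader.pages)

if __name__ == "__main__":
    print(extract_text("sample.pdf"))  # sample.pdf is a placeholder path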