/policy_crawler_v1

Crawl the latest policies

Policy Crawler

Background

Marketing professionals, salespeople, and company executives need the latest policy updates to keep their market strategies aligned with them. Many companies assign dedicated staff or departments to crawl and organize this information from the web, then pass it to their employees or sell it online. This project crawls the latest publicly released information from major government websites in eastern and southern mainland China. Initially the information is distributed for free via RSS feeds; other distribution models may be considered in later stages.

Architecture

Domain-Driven Design (DDD)

Package dependency constraints

As with Maven, specific packages may only be used in specific layers, and layers may only invoke one another in a fixed direction.

  • pdm - still too new; revisit in a few years
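
The layering rule above could be checked with a small script along the following lines. This is only a sketch: the layer names (`interfaces`, `application`, `infrastructure`, `domain`) and the allowed directions are assumptions chosen for illustration, not the project's actual package layout, and the helpers `imported_layers` and `violations` are hypothetical.

```python
import ast

# Hypothetical layers and the layers each one is allowed to import.
# These names are illustrative and are not taken from the project.
ALLOWED = {
    "interfaces": {"application", "domain"},
    "application": {"domain"},
    "infrastructure": {"domain"},
    "domain": set(),
}


def imported_layers(source: str) -> set[str]:
    """Return the top-level package names imported by a module's source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Only absolute imports are inspected; relative imports stay in-package.
            names.add(node.module.split(".")[0])
    return names


def violations(layer: str, source: str) -> list[str]:
    """List the layers that `layer` imports but is not allowed to depend on."""
    forbidden = set(ALLOWED) - ALLOWED[layer] - {layer}
    return sorted(imported_layers(source) & forbidden)
```

Running `violations` over every module of a layer in a test or CI step would flag imports that point in the wrong direction, without needing support from the packaging tool.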

TODO

  • redesign the database to store: region, openapi summary, picture

  • serve different feeds selected by path parameters, such as: .../zhejiang/feed

  • download multimedia content

Sample pages for image download:
    https://fzggw.zj.gov.cn/art/2022/9/1/art_1599545_58934718.html
    https://fzggw.zj.gov.cn/art/2022/11/1/art_1599545_58935041.html
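
As a starting point for the image download item, the sketch below collects image URLs from an article page using only the Python standard library. It assumes the images appear as ordinary `<img src="...">` tags, which has not been verified against the sample pages above; `ImageCollector` and `extract_image_urls` are hypothetical names, and fetching and saving the files is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collect absolute URLs of <img> tags found in an HTML document."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.urls: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            # Resolve relative paths against the page URL.
            self.urls.append(urljoin(self.base_url, src))


def extract_image_urls(html: str, base_url: str) -> list[str]:
    """Return the image URLs in `html`, in document order."""
    parser = ImageCollector(base_url)
    parser.feed(html)
    return parser.urls
```

The returned URLs could then be handed to whatever download step the crawler already uses for page content.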

License

GPL