Project work for the 2nd-year academic practice at HSE University.
Rosstat web scraper for obtaining officially published inflation values.
Rosstat does not provide its own API, so the usual way to get official Russian inflation data conveniently is to pay financial analytics companies for it. This script obtains the official values by web scraping instead: it extracts the Excel tables from the page HTML and processes them into a convenient format.
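The extraction step can be sketched roughly as follows. This is a minimal illustration, not the code in rosstat_parser.py: it uses the standard-library `html.parser` as a dependency-free stand-in for BeautifulSoup, and the names `ExcelLinkParser` and `find_excel_links` are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ExcelLinkParser(HTMLParser):
    """Collect the targets of <a> tags that point to Excel files (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Ignore any query string when checking the file extension.
        if href and href.lower().split("?")[0].endswith((".xls", ".xlsx")):
            # Resolve relative links against the page address.
            self.links.append(urljoin(self.base_url, href))


def find_excel_links(html, base_url):
    """Return absolute URLs of all Excel files linked from the given HTML."""
    parser = ExcelLinkParser(base_url)
    parser.feed(html)
    return parser.links
```

In a real run the HTML would come from `requests.get(page_url).text`, and each downloaded file could then be loaded with `pandas.read_excel` for further processing.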
Final product: weekly cron-job for collecting and updating data.
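Because the job runs every week, the update step should be safe to repeat. One way to do that is an insert-or-replace keyed by period, sketched below. This is an assumption about how the update could work, not the project's actual storage code: it uses the standard-library `sqlite3` module in place of SQLAlchemy, and the table `cpi`, its columns, and the function `upsert_cpi` are hypothetical.

```python
import sqlite3


def upsert_cpi(conn, rows):
    """Insert (period, value) pairs, overwriting values for periods already stored.

    Table and column names are hypothetical; the real schema may differ.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cpi ("
        "period TEXT PRIMARY KEY, value REAL NOT NULL)"
    )
    # The primary key on period makes re-running the job idempotent:
    # existing periods are replaced, new ones are appended.
    conn.executemany(
        "INSERT OR REPLACE INTO cpi (period, value) VALUES (?, ?)", rows
    )
    conn.commit()
```

With this shape, a weekly run that re-reads revised figures simply overwrites the stored value for that period instead of creating a duplicate row.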
- python3
- requests
- BeautifulSoup
- pandas, numpy
- SQLAlchemy

- Install python3
- Clone the repository and change the directory
$ git clone https://github.com/sd-denisoff/cpi-parser.git && cd cpi-parser
- Create a virtual environment and activate it
$ virtualenv --python=python3 venv
$ source venv/bin/activate
- Install dependencies
$ pip3 install -r requirements.txt
- Run the script
$ python3 rosstat_parser.py
Web-scraping result:
Developed by Stepan Denisov