A scraper designed to obtain data from the Maxima discount website.
I have come back to this project after almost 2 years and plan to finally finish it. I have rebuilt the scraper to account for the new website and will automate the scraping with an AWS server.
The scraper obtains the following data for every item:
- Item image url
- Discount icon text
- Item name
- Item discount time
- Discount shop size
- Item price euro
- Item price cents
- Discount text decorator
- Discount facilitator
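
As a rough illustration of how one scraped item could be represented and written to CSV, here is a minimal sketch. The class name `DiscountItem`, the helper `items_to_csv`, and all field names are hypothetical; they only mirror the list above and are not taken from scraperv2.py.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class DiscountItem:
    # Hypothetical field names, one per entry in the list above.
    image_url: str
    icon_text: str
    name: str
    discount_time: str
    shop_size: str
    price_euro: str
    price_cents: str
    text_decorator: str
    facilitator: str


def items_to_csv(items):
    """Serialize DiscountItem records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=[f.name for f in fields(DiscountItem)],
        lineterminator="\n",
    )
    writer.writeheader()
    for item in items:
        writer.writerow(asdict(item))
    return buffer.getvalue()
```

All values are kept as strings here because the scraped fields are page text; converting the euro and cent parts into a single number would be a separate step.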
scraperv2.py has to be run in an environment that has the following modules installed:
- pandas
- numpy
- Selenium
- requests
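
To check whether an environment already has these modules, something like the following sketch could be run first. The function name `missing_modules` is an assumption made for illustration, and the import names are assumed to match the list above (Selenium is imported as `selenium`).

```python
import importlib.util

# Import names assumed from the module list above.
REQUIRED_MODULES = ["pandas", "numpy", "selenium", "requests"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Calling `missing_modules()` returns an empty list when everything is available. Anything it reports can usually be installed with `pip install pandas numpy selenium requests`.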
The scraper was created for personal use, so there are no tests to ensure that the data is collected properly. If the site structure changes, the scraper will stop working and return an empty CSV file. I will be working on tests in the future.
That said, you are welcome to fork this scraper and work on improving it :). If there are any improvements that you would like to see, do not hesitate to contact me.
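
Until proper tests exist, the empty-CSV failure mentioned above could be caught with a small sanity check on the output. This is only a sketch: `check_scrape_output` is a hypothetical helper, and the column names passed to it are whatever the CSV is expected to contain.

```python
import csv
import io


def check_scrape_output(csv_text, required_columns):
    """Return a list of problems found in scraped CSV text (empty list means OK)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    header = reader.fieldnames or []  # None when the text has no header row
    problems = []
    missing = [col for col in required_columns if col not in header]
    if missing:
        problems.append("missing columns: " + ", ".join(missing))
    if not rows:
        problems.append("no data rows")
    return problems
```

A non-empty result after a scrape is a hint that the site structure may have changed.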
The scraper may report the following errors:
- Could not reach the site - there may be a problem with the requests library, or the site may be down.
- Some execution error - the code may be outdated, or the site structure may have changed. Do not hesitate to message me about problems.
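
The two error messages listed above could come from a structure like the sketch below. It is not the code in scraperv2.py: `run_scrape` and its `fetch` and `parse` parameters are hypothetical, and the real script may be organized differently. It assumes that network failures surface as `OSError`, which is the base class of `requests.RequestException`.

```python
def run_scrape(fetch, parse):
    """Run a fetch step and a parse step, returning (status message, data or None)."""
    try:
        page = fetch()
    except OSError:
        # Network-level failure: the site is down or the request could not be made.
        return "Could not reach the site", None
    try:
        return "ok", parse(page)
    except Exception:
        # Anything else, e.g. the page layout no longer matches what the parser expects.
        return "Some execution error", None
```

Keeping the two steps in separate `try` blocks makes it clear whether a failure came from the network or from the parsing logic.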