/autoplius

Autoplius scraper with VIN image decoding and reading, plus a built-in captcha solver

Autoplius ads scraper

Implemented tasks:

  • URL gathering from the Autoplius sitemap
  • Defined data fields
  • Free proxy support (disabled by default)
  • VIN code decoder and image reader using OCR tools
  • Automatic captcha solver (so there is no need to use proxies at this point)
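
As an illustration of the first item, here is a minimal sketch of pulling ad URLs out of a sitemap document. It is not the code from the notebook: the function name `extract_urls` and the sample XML are made up for the example, and it assumes the sitemap follows the standard sitemaps.org schema with `<loc>` elements. Downloading the sitemap is left out on purpose.

```python
import xml.etree.ElementTree as ET
from typing import List

# Namespace used by the standard sitemaps.org schema (assumed here).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_urls(sitemap_xml: str) -> List[str]:
    """Return the text of every <loc> element found in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text  # skip empty <loc/> elements
    ]


# Hypothetical sample; real sitemap content will differ.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/ads/1.html</loc></url>
  <url><loc> https://example.com/ads/2.html </loc></url>
</urlset>"""

urls = extract_urls(SAMPLE)
print(urls)
```

If the site publishes a sitemap index that points to several child sitemaps, the same function can be applied to the index first, since child sitemap locations are also stored in `<loc>` elements.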

Tasks to do:

  • [ ] Create a virtual environment to make installing the required packages easier
  • [ ] Merge new ad links with the list of already scraped URLs
  • [ ] Improve the automatic captcha solver by not hardcoding crop values (if the Selenium window is a different size, the captcha is not cropped properly)
  • [ ] Make it faster by implementing asynchronous requests
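
One possible direction for the captcha cropping task, sketched under assumptions rather than taken from the notebook: compute the crop box from the captcha element's own rectangle instead of fixed numbers. Selenium's `WebElement.rect` returns a dict with `x`, `y`, `width` and `height`, and Pillow's `Image.crop` takes a `(left, upper, right, lower)` tuple. The helper name `crop_box` and the `pixel_ratio` parameter are hypothetical; the sketch also assumes the page is not scrolled, so that element coordinates line up with the screenshot.

```python
from typing import Dict, Tuple


def crop_box(rect: Dict[str, float], pixel_ratio: float = 1.0) -> Tuple[int, int, int, int]:
    """Convert an element rectangle into a (left, upper, right, lower) crop box.

    rect        -- dict with "x", "y", "width", "height", e.g. WebElement.rect
    pixel_ratio -- screenshot pixels per CSS pixel (assumed 1.0 on standard
                   displays, often 2.0 on high-DPI ones)
    """
    left = round(rect["x"] * pixel_ratio)
    top = round(rect["y"] * pixel_ratio)
    right = round((rect["x"] + rect["width"]) * pixel_ratio)
    bottom = round((rect["y"] + rect["height"]) * pixel_ratio)
    return left, top, right, bottom


# Example rectangle; the values are made up for illustration.
example_rect = {"x": 10, "y": 20, "width": 100, "height": 40}
print(crop_box(example_rect))       # box for a standard display
print(crop_box(example_rect, 2.0))  # box for a 2x high-DPI screenshot
```

Because the box is derived from the element itself, it should follow the captcha when the window size changes. Selenium can also capture a single element directly through `WebElement.screenshot_as_png`, which may remove the need for manual cropping altogether.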

If you have any recommendations or code improvements, feel free to contact me.

Please note that this is a self-learning side project. Treat https://en.autoplius.lt/ data with respect and use this code at your own risk.