An easy-to-use, powerful crawler for OLX that collects various non-sensitive data about ads on the site.
- 🦾 Sufficient performance
- 🎭 Anonymous, especially via Tor
- ⚖️ Non-sensitive data
- 🔍 Filtering by keywords
- ⛓️ Command chaining
The project demonstrates experience with Selenium for web scraping 💪 and supports analyzing non-sensitive data about ads on the site 🧐. There are no ready-made solutions for collecting data from the site 😢.
You only need to install Google Chrome, that's all. There is no need to install the WebDriver binary manually. Thank you, @SergeyPirogov, for WebDriver Manager.
- Clone the Repository
- Install this Package (`./setup.py install`) or install dependencies from the Pipfile (`pipenv install`)
olx ads --help # Show help for ads command and exit
olx ads "https://www.olx.ua/uk/zhivotnye/koshki/" # Collect all ads with cats
olx ads --no-free ... # Only paid ads
olx ads --no-paid ... # Only free ads
olx ads --kind --title --price --location ... # Collect extra fields
olx ad --help # Show help for ad command and exit
olx ad "https://www.olx.ua/d/uk/obyavlenie/laskovye-shotlandskie-malyshi-IDNyrf4.html" # Collect ad details
olx ad --keywords keywords.txt ... # Filter by keywords
olx ad --title --description --author --profile --price --location ... # Collect extra fields
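The `--keywords` option above takes a file of keywords. How the crawler matches them is not described here, so the following is only a minimal sketch of one plausible approach, assuming one keyword per line and a case-insensitive substring match against the ad text. The helpers `load_keywords` and `matches_keywords` are hypothetical and are not part of olx-crawler.

```python
import io


def load_keywords(lines):
    """Return lowercased, non-empty keywords from an iterable of lines."""
    return [line.strip().lower() for line in lines if line.strip()]


def matches_keywords(text, keywords):
    """Return True if any keyword occurs in the text (case-insensitive)."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in keywords)


# Stand-in for open("keywords.txt"); the one-keyword-per-line format is an assumption.
keywords = load_keywords(io.StringIO("kitten\nScottish\n\n"))

print(matches_keywords("Affectionate Scottish babies", keywords))
print(matches_keywords("Used bicycle", keywords))
```

A substring match is the simplest choice; whole-word or stemmed matching would need a different comparison.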
olx ads --progress ... # Show progress
olx ads --no-headless ... # Disable headless mode
olx ads --proxy "socks5://..." # Use proxy server
olx ads --all ... # Collect all fields
olx ads --no-link ... # Skip link field
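Options such as `--no-headless` and `--proxy` usually end up as Chrome command-line switches when the browser is driven by Selenium. The sketch below shows one way such a mapping could look; it is an assumption, not olx-crawler's actual code. `chrome_arguments` is a hypothetical helper, and the proxy address is only an example (a local Tor daemon listening on its default SOCKS port). Each returned switch could then be passed to Selenium's `ChromeOptions.add_argument`.

```python
def chrome_arguments(headless=True, proxy=None):
    """Build a list of Chrome command-line switches for the given options."""
    arguments = []
    if headless:
        # Run Chrome without a visible window.
        arguments.append("--headless")
    if proxy:
        # Route browser traffic through the given proxy URL.
        arguments.append(f"--proxy-server={proxy}")
    return arguments


# Assumed example address: a local Tor daemon's default SOCKS port.
print(chrome_arguments(headless=False, proxy="socks5://127.0.0.1:9050"))
```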
olx ads "https://www.olx.ua/uk/zhivotnye/koshki/" | olx ad --all --progress > ads.csv # Command chaining
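Chaining works because the second command can read ad links from the output of the first. As a rough illustration of the consuming side, the sketch below reads a text stream and keeps only lines that look like URLs. The output format of `olx ads` is an assumption here (one link per line, possibly preceded by a header line), `read_links` is a hypothetical helper, and the sample URL is made up.

```python
import io


def read_links(stream):
    """Yield lines from a text stream that look like HTTP(S) URLs."""
    for line in stream:
        link = line.strip()
        if link.startswith(("http://", "https://")):
            yield link


# Stand-in for sys.stdin so the sketch runs without a pipe.
# In a real pipeline, pass sys.stdin instead.
sample = io.StringIO("link\nhttps://www.olx.ua/d/uk/obyavlenie/example-1.html\n\n")
links = list(read_links(sample))
print(links)
```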
👍🎉 First off, thanks for taking the time to contribute! 🎉👍
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
- Fork the Project
- Create your Feature Branch (`git checkout -b feature/awesome-feature`)
- Commit your Changes (`git commit -m 'Add awesome feature'`)
- Push to the Branch (`git push origin feature/awesome-feature`)
- Open a Pull Request
Leave a ⭐ if you think this project is cool or useful for you.
`olx-crawler` is licensed under the MIT License. See the `LICENSE` for more information.