A StackOverflow scraper that selectively extracts verified and highly voted questions and answers from the StackOverflow website and saves each one to a SQLite3 database file as a title and code, along with the source URL of the question.
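As a rough illustration of the storage format described above, the sketch below writes one title/code/URL record to SQLite. The table name `entries`, its column layout, the `save_entry` helper, and the sample URL are all assumptions made for this example; the actual schema used by `scraper.py` may differ.

```python
import sqlite3


def save_entry(conn, title, code, url):
    """Store one scraped item; hypothetical schema, not the project's own."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entries ("
        "url TEXT PRIMARY KEY, title TEXT NOT NULL, code TEXT NOT NULL)"
    )
    # INSERT OR IGNORE skips rows whose url is already present.
    conn.execute(
        "INSERT OR IGNORE INTO entries (url, title, code) VALUES (?, ?, ?)",
        (url, title, code),
    )
    conn.commit()


# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
sample_url = "https://stackoverflow.com/questions/0/placeholder"  # placeholder URL
save_entry(conn, "How do I reverse a list?", "items[::-1]", sample_url)
save_entry(conn, "How do I reverse a list?", "items[::-1]", sample_url)  # ignored
rows = conn.execute("SELECT title, code, url FROM entries").fetchall()
```

Using the URL as the primary key is one simple way to keep a question from being stored twice; a numeric question ID would work equally well.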
This project is licensed under the MIT License.
- Selectively scrapes verified and highly voted questions and answers from StackOverflow.
- Maintains records of processed questions to avoid duplication.
- Keeps track of scraped pages for efficient resumption of scraping.
- Utilizes JSON files for storing records and data persistence.
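
The record keeping listed above could look roughly like the following sketch. It assumes a single JSON file holding the processed question IDs and the last scraped page; the file name `records.json`, the keys `processed_ids` and `last_page`, and the helper functions are hypothetical and are not taken from the project's code.

```python
import json
import os
import tempfile


def load_records(path):
    """Return saved records, or an empty state on the first run."""
    if not os.path.exists(path):
        return {"processed_ids": [], "last_page": 0}
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)


def save_records(path, records):
    """Write to a temporary file first, then swap it into place."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as fh:
        json.dump(records, fh, indent=2)
    os.replace(tmp_path, path)


def mark_processed(records, question_id, page):
    """Record a question; return False if it was already processed."""
    if question_id in records["processed_ids"]:
        return False
    records["processed_ids"].append(question_id)
    records["last_page"] = max(records["last_page"], page)
    return True


# Demo in a temporary directory so no real project files are touched.
records_path = os.path.join(tempfile.mkdtemp(), "records.json")
records = load_records(records_path)
first = mark_processed(records, 101, page=1)   # new question
second = mark_processed(records, 101, page=1)  # duplicate, skipped
save_records(records_path, records)
resumed = load_records(records_path)           # state seen by the next run
```

On restart, a scraper following this pattern would continue from `resumed["last_page"]` and skip any ID already in `resumed["processed_ids"]`.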
- Clone the repository:
git clone https://github.com/Hammad389/stackoverflow-scraper.git
- Install the necessary dependencies:
pip install -r requirements.txt
- Run the scraper:
python scraper.py