
Wiki-WebCrawler-Scrapy

A Wikipedia web crawler written in Python with Scrapy. The ETL process involves multiple steps: extracting specific data from multiple Wikipedia pages using Scrapy, organizing it into a structured format using Scrapy Items, and saving the extracted data as JSON for further analysis and loading into MySQL via MySQL Workbench. The JSON dataset can also serve as a data source for an API, improving data accessibility.
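
The sketch below illustrates the extract-and-structure steps described above: a spider that pulls a page title and lead paragraph into a Scrapy Item, then follows internal wiki links to crawl further pages. The item fields, spider name, start URL, and link limit are illustrative assumptions, not the repository's actual definitions.

```python
import scrapy


class WikiArticleItem(scrapy.Item):
    # Illustrative fields; the project's real Item may define others.
    title = scrapy.Field()
    url = scrapy.Field()
    summary = scrapy.Field()


class WikiSpider(scrapy.Spider):
    name = "wiki"
    start_urls = ["https://en.wikipedia.org/wiki/Web_scraping"]  # assumed seed page

    def parse(self, response):
        # Organize extracted data into a structured Item rather than a raw dict.
        item = WikiArticleItem()
        item["title"] = response.css("h1#firstHeading ::text").get()
        item["url"] = response.url
        item["summary"] = response.css(
            "div.mw-parser-output > p:not(.mw-empty-elt) ::text"
        ).get()
        yield item

        # Follow internal wiki links to crawl multiple pages (capped for the demo).
        for href in response.css("a[href^='/wiki/']::attr(href)").getall()[:10]:
            yield response.follow(href, callback=self.parse)
```

Running the spider with Scrapy's built-in JSON feed export writes the items to a file, e.g. `scrapy runspider wiki_spider.py -O articles.json`.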
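
For the load step, one way to move the JSON feed into MySQL is a small script using `mysql-connector-python`; the resulting table can then be browsed in MySQL Workbench. The connection parameters, table name, and `articles.json` filename below are assumptions for illustration.

```python
import json

import mysql.connector  # assumes the mysql-connector-python package is installed

# Hypothetical connection settings; replace with your own.
conn = mysql.connector.connect(
    host="localhost", user="root", password="password", database="wiki"
)
cursor = conn.cursor()

cursor.execute(
    """
    CREATE TABLE IF NOT EXISTS articles (
        id INT AUTO_INCREMENT PRIMARY KEY,
        title VARCHAR(255),
        url VARCHAR(512),
        summary TEXT
    )
    """
)

# Scrapy's JSON feed export produces a single top-level array of items.
with open("articles.json", encoding="utf-8") as f:
    articles = json.load(f)

cursor.executemany(
    "INSERT INTO articles (title, url, summary) VALUES (%s, %s, %s)",
    [(a.get("title"), a.get("url"), a.get("summary")) for a in articles],
)
conn.commit()
conn.close()
```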