Web-Scraping

This repository contains Web Scraping Projects using Python

python -m venv /path/to/new/virtual/environment

Activate the Virtual Environment (command can vary on Windows or Linux based systems).

source </path/to/new/virtual/environment>/bin/activate

python -m venv .scrape
source .scrape/bin/activate

Learn Basics of Scraping by referring to Web Scraping.ipynb

Projects

WikiCompanies: This project utilizes Scrapy aims to scrape all US companies and its firmographics data from the Wikipedia infobox.
BuySellUnlistedShares: A simple web scraping script using BeautifulSoap that aims to scrape unlisted companies of India from URL: https://buysellunlistedshares.com/.
AZLyrics: Using BeautifulSoap and Selenium, this project aims to scrape music information such as artists, albums, songs, and lyrics from URL: https://azlyrics.com/.

Rurash Financials: This is a Selenium based web scraping script that aims to scrape unlisted companies of India from URL: https://rurashfin.com/unlisted-shares-list/.
UnlistedDeal: This is a Selenium based web scraping script that aims to scrape unlisted companies of India from URL: https://www.unlisteddeal.com/unlisted-share/.
UnlistedZoneShares: This is a Scrapy project that aims to scrape unlisted companies of India from URL: https://unlistedzone.com/shares/.
WWIPL: This is a Selenium based project that aims to scrape unlisted companies of India from URL: https://wwipl.com/.
UnlistedShareBrokers: This is a Selenium based web scraping script that aims to scrape unlisted companies of India from URL: https://www.unlistedsharebrokers.com/.

To create a new project, run the following command in the terminal: scrapy startproject my_first_spider
To create a new spider, first change the directory: cd my_first_spider
Create a spider: scrapy genspider example example.com. When we create a spider, we obtain a Basic Template. The class is built with the data we introduced in the previous command, but the parse method needs to be built by us.
Run the spider and export data to CSV or JSON

scrapy crawl example
scrapy crawl example -o name_of_file.csv
scrapy crawl example -o name_of_file.json

Directory Diagram of Scrapy

Process flow of the scraping process.