This Python project allows you to scrape movie or TV show data from IMDb using BeautifulSoup and requests. You can gather information such as movie ratings, cast, crew, and more.
- Scrape IMDb Data: The script scrapes data for movies, TV shows, or other IMDb content.
- Customizable URLs: You can specify IMDb URLs to scrape specific movie or TV show pages.
- Structured Data: The scraped data is organized into structured output formats (e.g., JSON, CSV).
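As a minimal sketch of what "structured output" could look like, the snippet below serializes a list of scraped records to JSON and CSV using only the standard library. The `records` list, its field names, and the `to_json` / `to_csv` helpers are hypothetical and chosen for illustration; the actual script may organize its output differently.

```python
import csv
import io
import json

# Hypothetical records; the field names are illustrative,
# not the schema the script actually produces.
records = [
    {"title": "Example Movie", "year": 2001, "rating": 8.1},
    {"title": "Example Show", "year": 2015, "rating": 7.4},
]


def to_json(rows):
    """Serialize a list of dicts to a pretty-printed JSON string."""
    return json.dumps(rows, indent=2, ensure_ascii=False)


def to_csv(rows):
    """Serialize a list of dicts to CSV text, using the first row's keys as the header."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_json(records))
    print(to_csv(records))
```

To save to disk instead of printing, the returned strings can be written to files such as `movies.json` or `movies.csv` (file names are placeholders).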
Before running the script, make sure you have the following installed:
- Python 3.x
- BeautifulSoup (install via pip; the package name is beautifulsoup4)
- Requests library (install via pip; the package name is requests)
- Clone this repository to your local machine:
git clone https://github.com/mrnithish/Web-scraping-IMDB.git
- Navigate to the project directory:
cd Imdb
- Modify the script, scrape.py, to specify the IMDb URLs you want to scrape. You can customize the URLs for specific movies, TV shows, or other IMDb content.
- Run the script using the following command:
python scrape.py
- The scraped data will be displayed in the console or saved in a structured output file, depending on your script's configuration.
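One way to keep the URL list easy to edit is to hold it in a single variable and validate each entry before scraping. The sketch below is an assumption about how that could be done, not a description of scrape.py: `IMDB_URLS`, `title_id`, and `canonical_url` are hypothetical names. It relies only on the fact that IMDb title pages live under `/title/tt<digits>/`.

```python
import re
from urllib.parse import urlparse

# Hypothetical configuration; the real scrape.py may store its URLs differently.
IMDB_URLS = [
    "https://www.imdb.com/title/tt0111161/",
    "https://www.imdb.com/title/tt0903747/?ref_=nv_sr_1",
]

# Title pages use a path of the form /title/tt<digits>/
TITLE_PATH = re.compile(r"^/title/(tt\d+)/?")
IMDB_HOSTS = {"www.imdb.com", "imdb.com", "m.imdb.com"}


def title_id(url):
    """Return the IMDb title ID (e.g. 'tt0111161'), or None if the URL is not a title page."""
    parsed = urlparse(url)
    if parsed.netloc not in IMDB_HOSTS:
        return None
    match = TITLE_PATH.match(parsed.path)
    return match.group(1) if match else None


def canonical_url(url):
    """Strip tracking parameters and return a normalized title URL."""
    tid = title_id(url)
    if tid is None:
        raise ValueError(f"Not an IMDb title URL: {url}")
    return f"https://www.imdb.com/title/{tid}/"


if __name__ == "__main__":
    for url in IMDB_URLS:
        print(canonical_url(url))
```

Normalizing the URLs up front means a pasted link with query parameters and a clean link for the same title are treated as one page.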
You can customize the script to scrape additional IMDb data by modifying the BeautifulSoup logic in the imdb_scraper.py file. For example, you can extract specific information like cast, crew, release date, and more.
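The following sketch shows the general shape of such BeautifulSoup logic, run against a small inline HTML sample so it works offline. The markup, the CSS selectors, and the `parse_title_page` function are all hypothetical: IMDb's real HTML is different and changes over time, so the selectors would need to be adapted after inspecting the live page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; it does not mirror IMDb's real page structure.
SAMPLE_HTML = """
<html><body>
  <h1 class="title">Example Movie</h1>
  <span class="release-date">2001-05-04</span>
  <ul class="cast">
    <li>Actor One</li>
    <li>Actor Two</li>
  </ul>
</body></html>
"""


def parse_title_page(html):
    """Extract title, release date, and cast from a page; missing fields become None or []."""
    soup = BeautifulSoup(html, "html.parser")
    title = soup.select_one("h1.title")
    release = soup.select_one("span.release-date")
    return {
        "title": title.get_text(strip=True) if title else None,
        "release_date": release.get_text(strip=True) if release else None,
        "cast": [li.get_text(strip=True) for li in soup.select("ul.cast li")],
    }


if __name__ == "__main__":
    print(parse_title_page(SAMPLE_HTML))
```

Checking each element for `None` before reading its text keeps the scraper from crashing when a page lacks a field or the site layout shifts.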
If you'd like to contribute to this project, please follow these steps:
- Fork the repository.
- Create a new branch (git checkout -b feature/fooBar).
- Make your changes.
- Commit your changes (git commit -am 'Add some fooBar').
- Push to the branch (git push origin feature/fooBar).
- Create a new Pull Request.
This project is licensed under the MIT License; see the LICENSE.md file for details.