Web_Scraping_and_Classification

This repository contains Python code that scrapes articles from a news website, extracts their headlines and content, and classifies them into categories with machine learning models (K-Nearest Neighbors and Support Vector Machine).
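
As a rough illustration of the fetch-and-extract step, here is a minimal sketch using requests and BeautifulSoup. The function names (`fetch_html`, `parse_articles`), the sample markup, and the `<article>`/`<h2>`/`<p>` layout are assumptions made for this example; they are not taken from the notebook, and a real news site will need its own selectors.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(url, timeout=10):
    """Download a page and return its HTML text (raises on HTTP errors)."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return response.text


def parse_articles(html):
    """Extract headline/content pairs from article markup.

    Assumption: each article sits in an <article> tag with an <h2>
    headline and <p> paragraphs. Adjust the selectors for a real site.
    """
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for article in soup.find_all("article"):
        headline = article.find("h2")
        if headline is None:
            continue  # skip blocks without a headline
        paragraphs = [p.get_text(strip=True) for p in article.find_all("p")]
        rows.append({
            "headline": headline.get_text(strip=True),
            "content": " ".join(paragraphs),
        })
    return rows


# Hypothetical markup standing in for a downloaded page.
SAMPLE_HTML = """
<article><h2>Markets rally</h2><p>Stocks rose.</p><p>Bonds fell.</p></article>
<article><h2>Team wins final</h2><p>A late goal decided it.</p></article>
"""

articles = parse_articles(SAMPLE_HTML)
for row in articles:
    print(row["headline"], "|", row["content"])
```

For a live page, `parse_articles(fetch_html(url))` would chain the two steps, and the resulting list of dictionaries can be passed to `pandas.DataFrame` and saved with `to_csv`.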

Dependencies

Make sure you have the following libraries installed:

requests
BeautifulSoup (bs4)
pandas
scikit-learn
imbalanced-learn (imblearn)
seaborn
nltk

You can install them using:

pip install requests beautifulsoup4 pandas scikit-learn imbalanced-learn seaborn nltk

Usage

Run News_Scraping_and_Classification_final.ipynb to fetch news articles and write them to News_data.csv, a CSV file with the headline, content, and category of each article.
Run news_classification.py to classify the articles with K-Nearest Neighbors and Support Vector Machine models. The hyperparameters of each model are selected through cross-validation.
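
The classification step can be sketched roughly as follows. This is not the code in news_classification.py: the toy texts and labels, the TF-IDF features, the parameter grids, and the use of `GridSearchCV` are all assumptions chosen to illustrate one common way of picking K-Nearest Neighbors and Support Vector Machine parameters by cross-validation.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Toy data (hypothetical); the real input would come from News_data.csv.
texts = [
    "stocks rise as markets rally",
    "central bank raises interest rates",
    "company reports record quarterly profit",
    "investors worry about inflation data",
    "oil prices fall on weak demand",
    "bank shares drop after earnings",
    "team wins the championship final",
    "striker scores twice in derby match",
    "coach praises players after victory",
    "injury rules captain out of tournament",
    "club signs new goalkeeper this season",
    "fans celebrate league title win",
]
labels = ["business"] * 6 + ["sports"] * 6

# Each candidate is a model plus the parameter grid to search (assumed values).
candidates = {
    "knn": (KNeighborsClassifier(), {"clf__n_neighbors": [1, 3]}),
    "svm": (SVC(kernel="linear"), {"clf__C": [0.1, 1, 10]}),
}

best = {}
for name, (model, grid) in candidates.items():
    # Vectorize inside the pipeline so each CV fold fits TF-IDF on its own training split.
    pipeline = Pipeline([("tfidf", TfidfVectorizer()), ("clf", model)])
    search = GridSearchCV(pipeline, grid, cv=3, scoring="accuracy")
    search.fit(texts, labels)
    best[name] = search
    print(name, search.best_params_, round(search.best_score_, 3))
```

Keeping the vectorizer inside the pipeline avoids leaking vocabulary statistics from the validation folds into training. If the categories are imbalanced, a resampling step from imbalanced-learn could be added with that library's own pipeline class; that is omitted here.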

Folder Structure

News_Scraping_and_Classification_final: Contains the code for web scraping and classification.
News_data.csv: CSV file with the scraped news data.
README.md: Placeholder for images used in the README.
README.md: Documentation explaining the code and usage.

Screenshots
