Fake_News_Detection_using_sklearn_pipeline: A Jupyter Notebook repository from HayatiYrtgl

This notebook builds a scikit-learn pipeline that classifies news articles as fake or real. The steps are as follows:

Importing Libraries: The code begins by importing the libraries it needs: pandas, numpy, seaborn (used for the label plot), and several scikit-learn modules.
Read Data: It reads a CSV file containing news data into a pandas DataFrame.
Data Cleaning:
- It checks for null values in the DataFrame.
- It replaces empty strings in the 'text' column with NaN values.
- It drops rows with NaN values.
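
The cleaning steps above can be sketched as follows. This is a minimal illustration, not the notebook's exact code: the toy DataFrame stands in for the CSV file, and the 'label' column name is an assumption (only 'text' is named in the description).

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the CSV file; the 'label' column
# name is an assumption made for this sketch.
df = pd.DataFrame({
    "text": ["Real story about the economy", "", "Miracle cure found", None],
    "label": ["real", "fake", "fake", "real"],
})

# Count missing values per column before cleaning
null_counts = df.isnull().sum()

# Replace empty strings in 'text' with NaN so dropna() can catch them
df["text"] = df["text"].replace("", np.nan)

# Drop every row that still contains a missing value
clean = df.dropna().reset_index(drop=True)
print(clean)
```

Replacing empty strings first matters because `isnull()` and `dropna()` do not treat `""` as missing.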
Data Exploration:
- It visualizes the distribution of labels ('fake' or 'real') using seaborn.
- It prints the counts of each label.
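
As a rough sketch of the label count, assuming the labels live in a column named 'label' (a hypothetical name) and using made-up values:

```python
import pandas as pd

# Hypothetical labels; the column name 'label' is an assumption.
df = pd.DataFrame({"label": ["fake", "real", "fake", "real", "fake"]})

# Number of rows per class, sorted from most to least frequent
counts = df["label"].value_counts()
print(counts)
```

With seaborn installed, a call such as `sns.countplot(x="label", data=df)` would draw the same counts as a bar chart. A strongly imbalanced result here would be a reason to read accuracy-style metrics with care.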
Train-Test Split: It splits the data into training and testing sets using 70% for training and 30% for testing.
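
A 70/30 split of this kind could look like the sketch below. The toy data, the column names, and `random_state=42` are assumptions added for reproducibility; the description only states the split ratio.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical data: ten placeholder articles with alternating labels
df = pd.DataFrame({
    "text": [f"article number {i}" for i in range(10)],
    "label": ["fake", "real"] * 5,
})

# test_size=0.3 keeps 30% of the rows for testing and 70% for training;
# random_state is an assumption that makes the shuffle repeatable
X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["label"], test_size=0.3, random_state=42
)
print(len(X_train), len(X_test))
```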
Pipeline Creation:
- It creates a machine learning pipeline consisting of two steps:
  - TF-IDF Vectorization: Converts text data into numerical features using TF-IDF (Term Frequency-Inverse Document Frequency).
  - Linear Support Vector Classification (LinearSVC): A linear SVM classifier is used for text classification.
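
The two-step pipeline described above might be assembled like this. The step names "tfidf" and "clf" are illustrative assumptions, and both estimators are shown with default parameters because the description does not list any settings.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Each step is a (name, estimator) pair; the names here are hypothetical.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),  # raw text -> sparse TF-IDF feature matrix
    ("clf", LinearSVC()),          # linear SVM trained on those features
])
print(pipeline)
```

Wrapping both steps in one `Pipeline` means the vectorizer's vocabulary is learned only from the data passed to `fit`, which keeps test documents from leaking into the features.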
Model Training: It fits the pipeline to the training data.
Prediction: It predicts the labels for the test data.
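
Training and prediction can be sketched end to end as follows. The example sentences and labels are invented for illustration and are far too few for a meaningful model; they only show the `fit` and `predict` calls.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Hypothetical training data, made up for this sketch
train_texts = [
    "scientists publish peer reviewed study on climate data",
    "government releases official quarterly budget report",
    "city council approves new public transport plan",
    "shocking secret miracle cure doctors hate revealed",
    "you will not believe this shocking celebrity secret",
    "miracle pill melts fat overnight shocking proof",
]
train_labels = ["real", "real", "real", "fake", "fake", "fake"]
test_texts = [
    "official report on public budget",
    "shocking miracle secret revealed",
]

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", LinearSVC()),
])

# fit() learns the TF-IDF vocabulary and the SVM weights from training data
pipeline.fit(train_texts, train_labels)

# predict() vectorizes the new texts with the learned vocabulary, then classifies
predictions = pipeline.predict(test_texts)
print(predictions)
```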
Evaluation:
- It computes the confusion matrix and prints it.
- It prints a classification report containing precision, recall, F1-score, and support for each class.
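
The evaluation calls can be illustrated with hand-written labels in place of real model output. The `y_true` and `y_pred` values below are made up for this sketch and are not results from the notebook.

```python
from sklearn.metrics import classification_report, confusion_matrix

# Hypothetical true and predicted labels, not results from the notebook
y_true = ["fake", "fake", "real", "real", "real", "fake"]
y_pred = ["fake", "real", "real", "real", "fake", "fake"]
labels = ["fake", "real"]

# Rows are true classes and columns are predicted classes, in `labels` order
cm = confusion_matrix(y_true, y_pred, labels=labels)
print(cm)

# Per-class precision, recall, F1-score and support as a formatted string
report = classification_report(y_true, y_pred, labels=labels)
print(report)
```

Here the matrix has two correct and one incorrect prediction in each row, so the diagonal holds the correct counts and the off-diagonal cells hold the two kinds of error.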

Overall, this code performs text classification on news articles using TF-IDF features and a LinearSVC classifier, and evaluates the model's performance using metrics such as precision, recall, and F1-score.
