Top 100 Defence Companies for Each Year from 2005

Primary Language: Jupyter Notebook

DefenceCompanies-WebScraping-And-Cleaning

This repository contains the data and the code used to scrape and clean the Top 100 rankings from https://people.defensenews.com/top-100/

Defense News Data

  • The dataset was scraped from Defense News.
  • Defense News is a global website and magazine about the politics, business and technology of defense. It serves an audience of senior military, government and industry decision-makers throughout the world.

The dataset is shared on Kaggle

  • Kaggle is a subsidiary of Google and describes itself as the world's largest data science community.

Dataset Link: https://www.kaggle.com/onurduman/defence-companies-top-100-for-each-year-from-2005

My Notebook: https://www.kaggle.com/onurduman/defence-companies-top-100-cleaning-eda

Files

Python Script

  • web_scraping_script.py - Script used to scrape the data from Defense News and save it as a CSV file.
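
For readers who want a feel for the scraping step without opening the script, here is a minimal sketch using only the Python standard library. It is not the code in web_scraping_script.py. It assumes the rankings are available as plain HTML table rows, which may not match how the real page serves its data, and the names TableRowParser, parse_rows, fetch_html and save_rows are hypothetical.

```python
import csv
from html.parser import HTMLParser
from urllib.request import urlopen

# URL taken from this README. Assumption: the page exposes its rankings
# as ordinary <table> markup; the real site may load them differently.
BASE_URL = "https://people.defensenews.com/top-100/"


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None


def parse_rows(html):
    """Return all table rows found in an HTML string."""
    parser = TableRowParser()
    parser.feed(html)
    return parser.rows


def fetch_html(url=BASE_URL):
    """Download a page and decode it as UTF-8."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def save_rows(rows, path):
    """Write the scraped rows to a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)
```

Under these assumptions, a run would look like save_rows(parse_rows(fetch_html()), "defence_companies_from_2005.csv"). Keeping parsing separate from downloading makes it possible to try the parser on a saved HTML snippet first.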

Notebook

  • data_cleaning.ipynb - Notebook used to clean the scraped data and save the cleaned version.
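
The kind of cleaning a scraped table typically needs can be sketched as follows. This is an illustration only and does not reproduce the notebook: the column names "company" and "revenue" and the helpers to_number and clean_rows are hypothetical, and the actual steps in data_cleaning.ipynb may differ.

```python
import csv
import io


def to_number(text):
    """Turn strings such as '$1,234.50' or '88%' into a float.

    Returns None when the text is not numeric (for example 'N/A').
    """
    cleaned = text.strip().replace("$", "").replace(",", "").replace("%", "")
    try:
        return float(cleaned)
    except ValueError:
        return None


def clean_rows(reader, numeric_columns):
    """Strip whitespace from every field and convert the chosen columns."""
    for row in reader:
        cleaned = {key: (value or "").strip() for key, value in row.items()}
        for column in numeric_columns:
            cleaned[column] = to_number(cleaned[column])
        yield cleaned


# Small in-memory example with made-up values.
raw = io.StringIO('company,revenue\n Example Corp ,"$1,234.50"\nOther,N/A\n')
cleaned_rows = list(clean_rows(csv.DictReader(raw), ["revenue"]))
```

Returning None for non-numeric cells keeps missing values visible instead of silently turning them into zeros.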

Data

  • defence_companies_from_2005.csv - The raw dataset scraped from Defense News by web_scraping_script.py.

  • defence_companies_from_2005_cleaned.csv - The cleaned dataset produced by data_cleaning.ipynb.
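
Since the dataset is meant to hold 100 companies per year, a quick sanity check after downloading either CSV is to count the rows for each year. The sketch below assumes a year column named "year", which is a guess; check the header of the file and pass the real name as year_column. The function name rows_per_year is hypothetical.

```python
import csv
from collections import Counter


def rows_per_year(path, year_column="year"):
    """Count how many rows the CSV at `path` has for each year value."""
    with open(path, newline="", encoding="utf-8") as f:
        return Counter(row[year_column] for row in csv.DictReader(f))
```

For example, rows_per_year("defence_companies_from_2005_cleaned.csv") returns a mapping from each year to its row count, so any year that does not have 100 entries stands out.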