Python Jupyter Notebook: Web Scraping

Description

Part of the IBM Data Analyst Professional Certificate program. This lab goes in depth into many of the techniques used in data science and data analysis to scrape the web for information.

It goes from data parsed from the web to DataFrames, using libraries such as pandas, Requests, BeautifulSoup, and json, among others.
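As a rough illustration of that path from parsed HTML to a DataFrame, here is a minimal sketch. It is not the lab's own code: the HTML string, its values, and the helper name `table_to_dataframe` are made up for the example, and in a live run the HTML would typically come from something like `requests.get(url).text`.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Placeholder HTML with made-up values (an assumption for this sketch).
# In a live run this string would typically come from requests.get(url).text.
html = """
<table>
  <tr><th>Name</th><th>Value</th></tr>
  <tr><td>A</td><td>10</td></tr>
  <tr><td>B</td><td>20</td></tr>
</table>
"""


def table_to_dataframe(html_text):
    """Parse the first <table> in html_text into a DataFrame (hypothetical helper)."""
    soup = BeautifulSoup(html_text, "html.parser")
    table = soup.find("table")
    rows = table.find_all("tr")

    # The first row holds the column headers.
    headers = [th.get_text(strip=True) for th in rows[0].find_all("th")]

    # Each remaining row becomes one record keyed by header name.
    records = []
    for row in rows[1:]:
        cells = [td.get_text(strip=True) for td in row.find_all("td")]
        records.append(dict(zip(headers, cells)))

    return pd.DataFrame(records, columns=headers)


df = table_to_dataframe(html)
# Scraped cells arrive as strings, so numeric columns need an explicit cast.
df["Value"] = df["Value"].astype(int)
print(df)
```

Parsing row by row keeps each step visible, which matches the way the lab walks through tags and filters before building the DataFrame.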

The table of contents is as follows:

- Beautiful Soup Object
  • Tag
  • Children, Parents, and Siblings
  • HTML Attributes
  • Navigable String

- Filter
  • find_all
  • find
  • HTML Attributes
  • Navigable String

- Downloading And Scraping The Contents Of A Web Page
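The topics listed above can be sketched in a few lines. This is only an illustrative sketch, not the lab's code: the HTML snippet and its values are made up, and only standard Beautiful Soup calls (`find`, `find_all`, `.parent`, `.next_sibling`, `.string`) are used.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML snippet, made up for illustration (not taken from the lab).
html = """
<html><body>
<h3 id="first"><b>Alpha</b></h3><p>Score: 1</p>
<h3><b>Beta</b></h3><p>Score: 2</p>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# Tag: attribute-style access returns the first matching tag.
tag = soup.h3

# Children, parents, and siblings.
child = tag.b                # first <b> inside the <h3>
parent = child.parent        # back up to the <h3>
sibling = tag.next_sibling   # the <p> that directly follows the <h3>

# HTML attributes behave like dictionary entries on the tag.
tag_id = tag["id"]

# Navigable string: the text held inside a tag.
text = child.string

# Filters: find_all returns every match, find returns only the first.
all_headings = soup.find_all("h3")
first_paragraph = soup.find("p")
by_attribute = soup.find_all(id="first")
by_string = soup.find_all(string="Beta")

print(parent.name, sibling.name, tag_id, text)
print(len(all_headings), first_paragraph.get_text(), by_attribute, by_string)
```

Note that `next_sibling` returns the `<p>` tag here only because the tags sit on the same line; when tags are separated by whitespace, the next sibling is the whitespace string.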

Languages and Utilities Used

  • Python Programming Language

Environments Used

  • Jupyter Notebook

Highlights:

Sample of the web scraping lab 1
- Population Data from Wikipedia

Sample of the web scraping lab 2
- Top 10 Most Populated Countries from Wikipedia