- `requests` library to download the “Topics on GitHub” page
- `BeautifulSoup` library to parse the page and extract information such as the top topics and the repositories under each topic, with their details
- `pandas` to convert the scraped data into a final dataframe
- Scrape https://github.com/topics
- Get a list of all topics. For each topic, get the topic `title`, topic page `URL`, and topic `description`
- For each topic, get the top 25 repositories from the topic page
- For each repository, grab the repo `name`, `username`, `stars`, and repo `URL`
- Finally, create a CSV file by compiling all the scraped data
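
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the CSS selectors (`a.topic-link`, `p.topic-title`, `p.topic-desc`, `article.repo-card`, `span.star-count`) and all function names are hypothetical placeholders, since GitHub's markup changes over time and the real class names have to be read off the live page. It also assumes star counts may be abbreviated (for example `12.3k`).

```python
import requests
from bs4 import BeautifulSoup

BASE_URL = "https://github.com"

# Hypothetical selectors: inspect the live page and replace these
# with the class names GitHub currently uses.
TOPIC_LINK_SELECTOR = "a.topic-link"
TOPIC_TITLE_SELECTOR = "p.topic-title"
TOPIC_DESC_SELECTOR = "p.topic-desc"
REPO_CARD_SELECTOR = "article.repo-card"
REPO_STARS_SELECTOR = "span.star-count"


def fetch_html(url):
    """Download a page and return its HTML, raising on HTTP errors."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return response.text


def parse_star_count(text):
    """Convert a star label such as '12.3k' or '987' to an integer."""
    text = text.strip().lower().replace(",", "")
    if text.endswith("k"):
        return int(round(float(text[:-1]) * 1000))
    return int(text)


def parse_topics(html):
    """Extract title, description, and URL for each topic on the page."""
    soup = BeautifulSoup(html, "html.parser")
    topics = []
    for link in soup.select(TOPIC_LINK_SELECTOR):
        title_tag = link.select_one(TOPIC_TITLE_SELECTOR)
        desc_tag = link.select_one(TOPIC_DESC_SELECTOR)
        if title_tag is None or desc_tag is None:
            continue  # skip links that do not look like topic cards
        topics.append({
            "title": title_tag.get_text(strip=True),
            "description": desc_tag.get_text(strip=True),
            "url": BASE_URL + link["href"],
        })
    return topics


def parse_repos(html, limit=25):
    """Extract name, owner, stars, and URL for up to `limit` repositories."""
    soup = BeautifulSoup(html, "html.parser")
    repos = []
    for card in soup.select(REPO_CARD_SELECTOR)[:limit]:
        links = card.select("h3 a")  # assumed order: owner link, then repo link
        stars_tag = card.select_one(REPO_STARS_SELECTOR)
        if len(links) < 2 or stars_tag is None:
            continue  # skip cards that do not match the assumed layout
        owner_link, repo_link = links[0], links[1]
        repos.append({
            "repo_name": repo_link.get_text(strip=True),
            "username": owner_link.get_text(strip=True),
            "stars": parse_star_count(stars_tag.get_text()),
            "repo_url": BASE_URL + repo_link["href"],
        })
    return repos
```

With network access, `parse_topics(fetch_html(BASE_URL + "/topics"))` would give the topic list, and calling `parse_repos(fetch_html(topic["url"]))` for each topic would give its repositories. Keeping the download separate from the parsing makes the parsers easy to check against saved HTML.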
- `title` - Name of the topic - [3D]
- `description` - Description of the topic - [3D modeling uses specialized software to create a digital model of a physical object. It is an aspect of 3D computer graphics, used for video games, 3D printing, and VR, among other applications.]
- `url` - URL of the topic - [https://github.com/topics/3d]

- `repo_name` - Name of the repository
- `username` - Owner of the repository
- `stars` - Number of stars on the repository
- `repo_url` - URL of the repository
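
One way to turn scraped records into CSV output with these columns is sketched below. It assumes each record is a dictionary keyed by the column names listed above; the helper names `records_to_frame` and `save_csv` are hypothetical and not part of pandas or of the original project.

```python
import pandas as pd

# Column order for the two kinds of records described above.
TOPIC_COLUMNS = ["title", "description", "url"]
REPO_COLUMNS = ["repo_name", "username", "stars", "repo_url"]


def records_to_frame(records, columns):
    """Build a DataFrame from a list of dicts, enforcing the column order."""
    return pd.DataFrame(records, columns=columns)


def save_csv(records, columns, path):
    """Write the records to `path` as CSV without the index column."""
    frame = records_to_frame(records, columns)
    frame.to_csv(path, index=False)
    return frame
```

For example, `save_csv(topics, TOPIC_COLUMNS, "topics.csv")` would write one row per topic, where `topics` is a list of dictionaries and `topics.csv` is a placeholder filename. Passing `columns` explicitly keeps the header order stable even if the dictionaries were built with keys in a different order.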