This project scrapes the public badge profiles of Google Cloud Study Jam participants. The scraped data is stored in a MongoDB database, where it can be consumed by any project; this is especially useful for building a leaderboard for the program.
- Clone the repository in your desired directory
- Store the public profile URLs in the `./src/data` directory as `input.csv` (don't worry, you only need to do this once; it is required to fetch the public badge profiles). **IMPORTANT:** the CSV must contain the two fields `Student Name` and `Profile URL`.
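As a rough sketch of the expected shape, `input.csv` would look like this (the names and profile IDs here are placeholders, not real participants):

```csv
Student Name,Profile URL
Jane Doe,https://www.cloudskillsboost.google/public_profiles/your-profile-id-1
John Smith,https://www.cloudskillsboost.google/public_profiles/your-profile-id-2
```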
- Create your own `.env` and store your MongoDB Atlas URI there (see `.env.example`).
- Install the `node_modules` with `npm install`, `yarn install`, or `pnpm install`.
- Run locally using `npm run dev`, `yarn dev`, or `pnpm dev`.
- All scraped data is written to the MongoDB database; you can then use that data to build your own leaderboard.
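Once the records are in MongoDB, turning them into a leaderboard is essentially a sort. A minimal sketch in plain Node follows; the `name` and `badgeCount` fields are hypothetical, so substitute whatever field names the scraper actually stores:

```javascript
// Hypothetical shape of documents fetched from the database.
const records = [
  { name: "Asha", badgeCount: 7 },
  { name: "Ravi", badgeCount: 12 },
  { name: "Mei", badgeCount: 9 },
];

// Sort by badge count, highest first, and attach a 1-based rank.
function buildLeaderboard(rows) {
  return [...rows]
    .sort((a, b) => b.badgeCount - a.badgeCount)
    .map((row, i) => ({ rank: i + 1, ...row }));
}

const leaderboard = buildLeaderboard(records);
console.log(leaderboard); // highest badge count ranked first
```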
- You can create a cron job to automate the scraping, running it once every 30 minutes to 1 hour.
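One way to set that up is a crontab entry like the one below; the repository path and log file are placeholders you would adjust to your own setup:

```cron
# Run the scraper every 30 minutes (edit the paths for your clone)
*/30 * * * * cd /path/to/your/clone && npm run dev >> /path/to/scraper.log 2>&1
```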
ENJOY 😉☕