Welcome to my new DBT project!

Description

3.1M records, 1.2 GB

This project uses DBT to analyze over 3 million events recorded by developers around the world on public GitHub repositories on April 16, 2023. The data was collected from the GitHub API and stored in a Snowflake database. DBT was used to transform the data into a more structured format and to generate analytical insights.

Hit the Star! ⭐

If you plan to use this repo for learning, or if you find this content helpful, please give it a star. Thanks! 🙌🏻

Project Goals

The goal of this project was to use DBT to analyze a large dataset of GitHub events. Specifically, it set out to:

  • Understand the types of events that are recorded by GitHub
  • Identify the most active developers, programming languages, and repositories
  • Analyze the trends in GitHub activity
  • See my own GitHub activity (alexbonella)
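
As a rough illustration of the "trends in GitHub activity" goal, the sketch below buckets event timestamps by hour of day. It assumes each event carries a `created_at` value in the ISO 8601 UTC form the GitHub Events API returns (for example `2023-04-16T13:05:09Z`). The function name `events_per_hour` and the sample timestamps are made up for illustration and are not taken from this project.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample timestamps, NOT real project data.
SAMPLE_TIMESTAMPS = [
    "2023-04-16T00:01:00Z",
    "2023-04-16T00:45:10Z",
    "2023-04-16T13:05:09Z",
]


def events_per_hour(timestamps):
    """Count events per UTC hour of day, returned in ascending hour order."""
    hours = Counter()
    for ts in timestamps:
        # Assumed format: second-precision UTC timestamps ending in "Z".
        parsed = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ")
        hours[parsed.hour] += 1
    return dict(sorted(hours.items()))


if __name__ == "__main__":
    print(events_per_hour(SAMPLE_TIMESTAMPS))
```

In the project itself this kind of aggregation would more naturally live in a SQL model. The Python version is only meant to show the shape of the calculation.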

Project Tools

This project uses the following tools: the GitHub API (data collection), Snowflake (storage), and DBT (transformation).

Project Methodology

The project was implemented using the following steps:

  • The GitHub API was used to collect the data.
  • The data was stored in a Snowflake database.
  • DBT was used to transform the data.
  • Analytical insights were generated using SQL queries.
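
The steps above can be sketched end to end on a toy scale. This is a minimal sketch under stated assumptions, not the project's actual code: an in-memory SQLite database stands in for Snowflake, a plain Python function stands in for the DBT transformation, and the sample records are invented. The field names (`id`, `type`, `actor.login`, `repo.name`, `created_at`) follow the GitHub Events API payload. The table name `stg_events` and the function names are hypothetical.

```python
import json
import sqlite3

# Made-up records shaped like GitHub Events API payloads (illustrative only).
RAW_EVENTS = [
    '{"id": "1", "type": "PushEvent", "actor": {"login": "dev_a"},'
    ' "repo": {"name": "org/alpha"}, "created_at": "2023-04-16T00:01:00Z"}',
    '{"id": "2", "type": "WatchEvent", "actor": {"login": "dev_b"},'
    ' "repo": {"name": "org/alpha"}, "created_at": "2023-04-16T00:02:00Z"}',
    '{"id": "3", "type": "PushEvent", "actor": {"login": "dev_a"},'
    ' "repo": {"name": "org/beta"}, "created_at": "2023-04-16T00:03:00Z"}',
    '{"id": "4", "type": "IssuesEvent", "actor": {"login": "dev_c"},'
    ' "repo": {"name": "org/alpha"}, "created_at": "2023-04-16T00:04:00Z"}',
]


def flatten(raw):
    """Turn one nested JSON event into a flat row (the 'transform' step)."""
    event = json.loads(raw)
    return (
        event["id"],
        event["type"],
        event["actor"]["login"],
        event["repo"]["name"],
        event["created_at"],
    )


def top_repos(raw_events, limit=10):
    """Load flattened events into SQLite and rank repositories by event count."""
    conn = sqlite3.connect(":memory:")  # stand-in for the Snowflake database
    conn.execute(
        "CREATE TABLE stg_events ("
        "event_id TEXT, event_type TEXT, actor_login TEXT,"
        " repo_name TEXT, created_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO stg_events VALUES (?, ?, ?, ?, ?)",
        [flatten(raw) for raw in raw_events],
    )
    rows = conn.execute(
        "SELECT repo_name, COUNT(*) AS event_count"
        " FROM stg_events"
        " GROUP BY repo_name"
        " ORDER BY event_count DESC, repo_name"
        " LIMIT ?",
        (limit,),
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for repo_name, event_count in top_repos(RAW_EVENTS, limit=2):
        print(repo_name, event_count)
```

Changing the `GROUP BY` column to `actor_login` or `event_type` would give the analogous "most active developers" and "event types" rankings.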

Project Results

Top Repos

image

Top Languages

image

Top Events (alexbonella)

image

Project Conclusion

This project demonstrates how DBT can be used to analyze large amounts of data from GitHub. The results of this project can be used to gain insights into developer activity and trends around the world.