gdsc7

Capgemini & AWS GDSC7 Hackathon solution by the Da_Vincis_PizzAI team

Primary language: Python. License: Apache License 2.0 (Apache-2.0).

Global Data Science Challenge 7


The grade-AI generation. Re-righting Education with data & AI


📖 Hi everyone and welcome to the 7th edition of the Global Data Science Challenge: The grade-AI generation! 📖

We’re thrilled to have you join us for this year’s event where we, the Large Human Models (LHM), team up with our new AI pals – the Large Language Models (LLM). Sure, we might not have everything in common (they may not need coffee breaks), but it’s amazing how well we can work together, especially when it comes to tackling challenges in education. So get ready to flex those prompt engineering muscles and dive into some seriously cool conversations with AI, all while supporting a great cause: education!

This year, our focus is on improving the analysis of massive amounts of form data collected from the Progress in International Reading Literacy Study (PIRLS). Students, parents, and teachers around the world have filled out a variety of questionnaires, addressing that timeless cornerstone of education: reading.

Why reading, you ask? Well, because education – and especially reading – empowers people. It fuels creativity, sparks innovation, and opens doors to new opportunities. Plus, it’s pretty handy when it comes to making sense of the world. Reading not only enhances critical thinking but also builds empathy and fosters an understanding of different cultures and ideas. Essentially, reading is like a superpower that unites us across borders and helps us thrive in an increasingly complex, information-packed world.

In this challenge, you’ll get the chance to team up with LLMs and make the process of digging through heaps of questionnaire data more efficient. Your mission? To create an AI agent capable of answering complex questions about PIRLS – but with a twist: your agent needs to retrieve real data instead of relying on its memory. This won’t just require your prompt engineering skills, but also some serious data analysis, database querying, and a solid understanding of LLMs (along with their delightful quirks).
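The "retrieve real data instead of relying on memory" idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `reading_scores` table, its columns, the made-up rows, and the helper names are hypothetical and do not reflect the actual PIRLS 2021 schema or this team's solution. The LLM call is left out; the sketch only shows the grounding step, where the answer is built from a database query and the agent declines to answer when the query returns nothing.

```python
import sqlite3


def build_toy_database() -> sqlite3.Connection:
    """Create an in-memory database with made-up rows (NOT real PIRLS data)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table and columns, used only for this illustration.
    conn.execute("CREATE TABLE reading_scores (country TEXT, score REAL)")
    conn.executemany(
        "INSERT INTO reading_scores (country, score) VALUES (?, ?)",
        [("A", 500.0), ("A", 520.0), ("B", 480.0)],
    )
    conn.commit()
    return conn


def average_score(conn: sqlite3.Connection, country: str):
    """Tool an agent could call: compute the average score from stored rows."""
    row = conn.execute(
        "SELECT AVG(score) FROM reading_scores WHERE country = ?",
        (country,),
    ).fetchone()
    # AVG over zero rows yields NULL, which sqlite3 returns as None.
    return row[0]


def answer(conn: sqlite3.Connection, country: str) -> str:
    """Build the reply from retrieved data only; never guess a value."""
    value = average_score(conn, country)
    if value is None:
        return f"No data found for country {country!r}."
    return f"Average reading score for {country}: {value:.1f}"


if __name__ == "__main__":
    connection = build_toy_database()
    print(answer(connection, "A"))
    print(answer(connection, "Z"))
```

In a real agent, the language model would choose which query to run and phrase the final answer, while the numbers themselves would still come from the database.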

As much as this is a competition, remember that it’s also an opportunity to connect and collaborate with your fellow participants. The Global Data Science Challenge is a community of talented minds from around the world, all ready to share ideas, learn from each other, and dive deep into the hot new world of LLMs. And hey, let’s not forget to have a little fun while we’re at it!

So, buckle up buttercup – education needs you! The Grade-AI Generation is calling for your creativity, ingenuity, and problem-solving prowess. Together, we can make a real difference in the world of education, one line of code at a time.

Good luck to you all, and see you on the battlefield! (menacing foreshadowing) ⚔️ 📖

What can you find in this repo?


This repo contains the solution by the Da_Vincis_PizzAI team, whose members are:

The solution earned the team 6th place out of 405 teams.

This repo is for demonstration purposes only.
Running it would require a connection to the PIRLS 2021 database and an AWS account, neither of which can be provided.