Inspired by the OpenBookQA dataset, this competition challenges participants to answer difficult science-based questions written by a Large Language Model.

In short, the dataset for this Kaggle competition was generated by GPT-3.5, a 175-billion-parameter model. The challenge is to see whether a much smaller LLM, roughly in the 7-30 billion parameter range and made trainable on modest hardware through parameter-efficient fine-tuning methods such as LoRA and its quantized variant QLoRA, can pass an exam set by another, far larger LLM.
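To make this concrete, below is a minimal sketch of what such a setup might look like with the Hugging Face transformers and peft libraries. The base model name, target modules, and LoRA hyperparameters are illustrative assumptions, not the configuration used in this notebook.

```python
# Minimal QLoRA-style sketch (illustrative, not the notebook's actual setup):
# load a ~7B model in 4-bit precision and attach LoRA adapters with peft.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_name = "meta-llama/Llama-2-7b-hf"  # hypothetical choice of base model

# 4-bit NF4 quantization config (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA adapters: only these small low-rank matrices are trained,
# while the quantized base weights stay frozen.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

Running a sketch like this requires a GPU with bitsandbytes installed; the key point is that only the LoRA adapter weights are updated during fine-tuning.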

Purpose of the Notebook

  1. Load and preprocess the competition data 📁 (a loading sketch follows this list)
  2. Engineer relevant features for model training 🏋️‍♂️
  3. Train predictive models to make target variable predictions 🧠
  4. Submit predictions to the competition environment 📤
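
As a rough illustration of the first step, here is a hedged sketch of loading the competition files with pandas. The data directory and column names (prompt, options A-E, answer) are assumptions based on the usual layout of this competition's data, not verified against the notebook itself.

```python
# Minimal sketch of loading and lightly preprocessing the competition data,
# assuming train.csv / test.csv with a question `prompt`, option columns A-E,
# and a letter `answer` column in train. Paths and column names are assumed.
import pandas as pd

DATA_DIR = "/kaggle/input/kaggle-llm-science-exam"  # assumed Kaggle input path

train = pd.read_csv(f"{DATA_DIR}/train.csv")
test = pd.read_csv(f"{DATA_DIR}/test.csv")

OPTIONS = ["A", "B", "C", "D", "E"]

def build_text(row: pd.Series) -> str:
    """Join the question and its five options into a single prompt string."""
    options = "\n".join(f"{opt}: {row[opt]}" for opt in OPTIONS)
    return f"Question: {row['prompt']}\n{options}\nAnswer:"

train["text"] = train.apply(build_text, axis=1)
test["text"] = test.apply(build_text, axis=1)
print(train[["text", "answer"]].head())
```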

What to expect from this project

  1. Data Preparation: In this section, we load and preprocess the competition data.
  2. Feature Engineering: We generate and select relevant features for model training.
  3. Model Training: We train machine learning models on the prepared data.
  4. Prediction and Submission: We make predictions on the test data and submit them for evaluation (a submission sketch follows this list).
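
For the final step, here is a hedged sketch of writing a submission file. It assumes the expected format is an `id` column plus a `prediction` column holding up to three option letters ranked best first (the competition is scored with MAP@3), and it uses random placeholder scores in place of a real model's outputs.

```python
# Minimal sketch of building submission.csv, assuming each row needs the top-3
# option letters (space-separated, best first) for its question id.
import numpy as np
import pandas as pd

DATA_DIR = "/kaggle/input/kaggle-llm-science-exam"  # assumed Kaggle input path
test = pd.read_csv(f"{DATA_DIR}/test.csv")

OPTIONS = ["A", "B", "C", "D", "E"]

def top3_letters(scores: np.ndarray) -> str:
    """Return the three highest-scoring options as a space-separated string."""
    order = np.argsort(scores)[::-1][:3]
    return " ".join(OPTIONS[i] for i in order)

# Placeholder scores; a real model would produce one score per answer option.
rng = np.random.default_rng(0)
scores = rng.random((len(test), len(OPTIONS)))

submission = pd.DataFrame({
    "id": test["id"],
    "prediction": [top3_letters(s) for s in scores],
})
submission.to_csv("submission.csv", index=False)
```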

Disclaimer - To run this notebook, you need access to the data for the Kaggle LLM Science Exam competition.