Chem_LLM_Hackathon

Repository for the Imperial 2023 large language model hackathon.


Welcome to the 2023 Chemistry DigiFab Hackathon

Challenge 1: Retrosynthesis Prediction

In this challenge, you will be given a set of molecules and their corresponding retrosyntheses. Your task is to build a model that can predict the retrosynthesis of new molecules. The goal is to compare the performance of more tailored language models (the Molecular Transformer, https://github.com/pschwllr/MolecularTransformer) with more general language models (GPT-3) on the specific task of retrosynthesis prediction. The starting point for this task is in the Task 1 folder.
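To make the comparison between the two kinds of model concrete, here is a minimal sketch of one way to score predictions locally before submitting. It assumes that each prediction and each reference is a SMILES string with reactants separated by `.`, and that a prediction counts as correct only when its set of reactants matches the reference exactly. The function names and the example data are hypothetical, and this is not necessarily how the automated scoring works.

```python
from typing import Sequence, Tuple


def normalize_reactants(smiles: str) -> Tuple[str, ...]:
    """Split a dot-separated SMILES string and sort the parts.

    Assumption: reactant order does not matter. This does NOT canonicalize
    the SMILES themselves; two different spellings of the same molecule
    will still be treated as different.
    """
    parts = [part.strip() for part in smiles.split(".") if part.strip()]
    return tuple(sorted(parts))


def top1_accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions whose reactant set equals the reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(
        normalize_reactants(pred) == normalize_reactants(ref)
        for pred, ref in zip(predictions, references)
    )
    return hits / len(references)


# Hypothetical example data: two target molecules, two models.
references = ["CCO.CC(=O)O", "Brc1ccccc1"]
tailored_model = ["CC(=O)O.CCO", "Brc1ccccc1"]   # same reactants, different order
general_model = ["CC(=O)O.CCO", "Clc1ccccc1"]    # second prediction is wrong

print("tailored:", top1_accuracy(tailored_model, references))
print("general: ", top1_accuracy(general_model, references))
```

Because the comparison is purely string based, it may undercount correct answers written in a different but equivalent SMILES form. Canonicalizing with a cheminformatics toolkit before comparing would be a reasonable refinement.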

Challenge 2: Large Language Models for General Science

In this afternoon task, the goal is to evaluate the performance of large language models (LLMs) at answering specific exam questions. The starting point is the knowledge extraction notebook in the Task 2 folder.

Using the automated scoring

Your progress on the challenges will be monitored using the automated scoring system. The general idea is to submit a pull request containing your best predictions. The scores will be calculated automatically and uploaded as a comment on your pull request. You can submit multiple predictions and track the progress of your scores over time.

The detailed instructions for using the scoring system are as follows:

  1. Navigate to the hackathon GitHub page: https://github.com/stevenkbennett/Chem_LLM_Hackathon
  2. Please speak to one of the demonstrators who will add you to the repository. You will need to do this before submitting files!
  3. Click “Add file”, then “Upload files”.
  4. Select “Create a new branch” and click “Propose changes”.
  5. Click “Create pull request”. Your automated scores will become available after a few minutes.
  6. To submit new test files, just upload them again using the “Add file” button with the same branch name and the scores will be recalculated.

Tracking progress

Each task is worth 45 points, and a further 10 points will be allocated by the judges at the end of the hackathon. All scores can be seen in the pull requests and on the leaderboard that will be displayed on screen during the day.
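As a quick illustration of the point allocation, the sketch below adds up a team's score. It assumes there are exactly two tasks (Challenge 1 and Challenge 2) and that the final score is the simple sum of the two task scores and the judges' points, for a maximum of 100. The function and constant names are hypothetical, and the actual leaderboard calculation may differ.

```python
TASK_MAX = 45    # maximum points per task, as stated above
JUDGES_MAX = 10  # maximum points allocated by the judges


def total_score(task1: float, task2: float, judges: float) -> float:
    """Sum the three components after checking each is within its cap.

    Assumption: the overall score is a plain sum of the components.
    """
    for name, value, cap in (
        ("task1", task1, TASK_MAX),
        ("task2", task2, TASK_MAX),
        ("judges", judges, JUDGES_MAX),
    ):
        if not 0 <= value <= cap:
            raise ValueError(f"{name} must be between 0 and {cap}, got {value}")
    return task1 + task2 + judges


# Hypothetical team scores.
print(total_score(30, 38.5, 7))
```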