In this challenge, you will be given a set of molecules and their corresponding retrosyntheses. Your task is to build a model that can predict the retrosynthesis of new molecules. The goal is to compare the performance of a more tailored language model (the Molecular Transformer, https://github.com/pschwllr/MolecularTransformer) with that of a more general language model (GPT-3) on the specific task of retrosynthesis prediction. The starting point for this task is in the Task 1 folder.
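When comparing two models locally, a simple top-1 exact-match accuracy can be a useful first check. The sketch below is only an illustration: the metric used by the automated scoring system is not described here, the function names are hypothetical, and the SMILES strings are toy examples rather than real model output. It compares reactant sets as plain strings, ignoring only whitespace and the order of the dot-separated components; it does not canonicalise SMILES, so two different spellings of the same molecule would count as a mismatch.

```python
def normalise_reactants(smiles: str) -> str:
    """Strip whitespace and sort the dot-separated components of a reactant set."""
    parts = [part.strip() for part in smiles.split(".") if part.strip()]
    return ".".join(sorted(parts))


def top1_accuracy(predictions: list[str], references: list[str]) -> float:
    """Fraction of predictions that match the reference reactant set exactly."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(
        normalise_reactants(pred) == normalise_reactants(ref)
        for pred, ref in zip(predictions, references)
    )
    return hits / len(references)


# Toy data (assumed for illustration only, not real model output).
references = ["CC(=O)O.OCC", "c1ccccc1Br.OB(O)c1ccccc1"]
model_a_predictions = ["OCC.CC(=O)O", "c1ccccc1Br.OB(O)c1ccccc1"]
model_b_predictions = ["CC(=O)O.OCC", "c1ccccc1I.OB(O)c1ccccc1"]

accuracy_a = top1_accuracy(model_a_predictions, references)
accuracy_b = top1_accuracy(model_b_predictions, references)
print(f"model A: {accuracy_a:.2f}, model B: {accuracy_b:.2f}")
```

Running the same function over the predictions of each model gives a like-for-like number to track between submissions, although it may differ from the official score.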
In the afternoon task, the goal is to evaluate how well large language models (LLMs) answer specific exam questions. The starting point is the knowledge extraction notebook in the Task 2 folder.
Your progress on the challenges will be monitored using the automated scoring system. The general idea is to submit a pull request containing your best predictions. The scores will be calculated automatically and uploaded as a comment on your pull request. You can submit multiple predictions and track the progress of your scores over time.
The detailed instructions for using the scoring system are as follows:
- Navigate to the hackathon GitHub page: https://github.com/stevenkbennett/Chem_LLM_Hackathon
- Please speak to one of the demonstrators, who will add you to the repository. You will need to do this before submitting files!
- Click “Add file” then “Upload files”
- Select “Create a new branch” and click “Propose changes”
- Click “Create pull request”. Your automated scores will become available after a few minutes.
- To submit new test files, upload them again using the “Add file” button with the same branch name, and the scores will be recalculated.
Each task is worth 45 points, and a further 10 points will be allocated by the judges at the end of the hackathon. The scores can be seen in the pull requests and on the leaderboard that will be displayed on screen during the day.
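As a quick sanity check on the point allocation, the sketch below adds up the three components. It assumes that the final score is the plain sum of the two task scores and the judges' points (which would give a maximum of 100); the function and constant names are hypothetical and are not part of the scoring system.

```python
TASK_MAX = 45    # maximum points per task, as stated above
JUDGES_MAX = 10  # maximum points allocated by the judges


def total_score(task1: float, task2: float, judges: float) -> float:
    """Sum the three components after checking each against its maximum."""
    components = (
        ("task1", task1, TASK_MAX),
        ("task2", task2, TASK_MAX),
        ("judges", judges, JUDGES_MAX),
    )
    for name, value, cap in components:
        if not 0 <= value <= cap:
            raise ValueError(f"{name} must be between 0 and {cap}, got {value}")
    return task1 + task2 + judges


max_total = total_score(TASK_MAX, TASK_MAX, JUDGES_MAX)
print(max_total)
```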