[NeurIPS 24] Can LLMs Solve Molecule Puzzles? A Multimodal Benchmark for Molecular Structure Elucidation

MolPuzzle: Can LLMs Solve Molecule Puzzles? A Multimodal Benchmark for Molecular Structure Elucidation

Website Paper Dataset

🔥 Updates & News

  • 2024.09: 🎉🎉 MolPuzzle has been accepted to the NeurIPS 2024 Datasets and Benchmarks Track as a spotlight!

💡 Overview

We present MolPuzzle, a benchmark comprising 234 structure elucidation instances with over 18,000 QA samples. The samples are presented as a sequential puzzle-solving process with three interlinked subtasks: molecule understanding, spectrum interpretation, and molecule construction.

Figure: The problem of molecular structure elucidation alongside its analogical counterpart, the crossword puzzle, highlighting the parallels in strategy and complexity between these two intellectual challenges.

📕 Model Summary

| Model | Stage 1 | Stage 2 | Stage 3 |
|---|---|---|---|
| GPT-4o | ✅ | ✅ | ✅ |
| Claude-3 | ✅ | ❌ | ✅ |
| Gemini-pro | ✅ | ❌ | ✅ |
| GPT-3.5 | ✅ | ❌ | ✅ |
| Gemini-3-pro-vision | ❌ | ✅ | ❌ |
| LLava1.5-8b | ❌ | ✅ | ❌ |
| Qwen-VL-Chat | ❌ | ✅ | ❌ |
| InstructBLIP-7b | ❌ | ✅ | ❌ |
| InstructBLIP-13b | ❌ | ✅ | ❌ |
| Llama3-8b | ✅ | ❌ | ❌ |
| Vicuna-7b | ✅ | ❌ | ❌ |
| Llama2-7b | ✅ | ❌ | ❌ |
| Llama2-13b | ✅ | ❌ | ❌ |
| Mistral-7b | ✅ | ❌ | ❌ |
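
To pick the models relevant to a given stage, the coverage table above can be transcribed into a small lookup. This is only an illustrative sketch: the names `STAGE_COVERAGE` and `models_for_stage` are hypothetical and not part of the repository, and the keys are the labels from the table, which may differ from the identifiers the scripts accept on the command line.

```python
# Stage coverage transcribed from the table above: model label -> stages evaluated.
# ASSUMPTION: these are the table labels, not necessarily the CLI model names.
STAGE_COVERAGE = {
    "GPT-4o": {1, 2, 3},
    "Claude-3": {1, 3},
    "Gemini-pro": {1, 3},
    "GPT-3.5": {1, 3},
    "Gemini-3-pro-vision": {2},
    "LLava1.5-8b": {2},
    "Qwen-VL-Chat": {2},
    "InstructBLIP-7b": {2},
    "InstructBLIP-13b": {2},
    "Llama3-8b": {1},
    "Vicuna-7b": {1},
    "Llama2-7b": {1},
    "Llama2-13b": {1},
    "Mistral-7b": {1},
}


def models_for_stage(stage: int) -> list[str]:
    """Return the models evaluated on the given stage, in table order."""
    return [model for model, stages in STAGE_COVERAGE.items() if stage in stages]


if __name__ == "__main__":
    for stage in (1, 2, 3):
        print(f"Stage {stage}: {', '.join(models_for_stage(stage))}")
```

Only GPT-4o appears in all three stages, so it is the one model in the table that can be followed through the whole pipeline.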

📊 Dataset Statistics

Figure: MolPuzzle dataset statistics.

The initial molecules were selected by referencing the textbook Organic Structures from Spectra, 4th Edition, available as an online PDF on ResearchGate. We chose 234 molecules based on spectrum tasks involving IR, MS, 1H-NMR, and 13C-NMR to reflect a difficulty level suitable for graduate students. To address copyright concerns, we excluded molecules with publicly available mass spectrometry (MS) spectra in open-source databases from our study. The remaining spectra were sourced from public resources, notably the PubChem database. For additional spectra that were unavailable, we used simulation methods and provided a Jupyter notebook to generate these data, ensuring high-quality spectra for analysis.

You can download the dataset at data

🔧 Usage Demos

  1. Install Required Packages
    Install the necessary Python packages by running:

    pip install -r requirements.txt
  2. API Key Setup
    Add API keys for OpenAI, Claude, and Gemini models.
  3. Example Commands (Stage 2)
    • Generate Responses for IR Using Multiple Models

            python stage2.py --task IR --action generate_responses --models instructBlip-7B instructBlip-13B llava gpt-4 claude-v1 --iterations 3
    • Evaluate Responses for IR

            python stage2.py --task IR --action evaluate --models instructBlip-7B instructBlip-13B llava gpt-4 claude-v1 --iterations 3
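
The generate and evaluate steps above share the same flags, so they can be driven from one small helper. The following is a minimal sketch, not part of the repository: `missing_api_keys` and `build_stage2_command` are hypothetical names, and the environment variable names are an assumption based on common SDK conventions, since this README does not say where the scripts read their keys from. The script only prints the commands; it does not run them.

```python
import os
import shlex

# ASSUMPTION: these variable names follow common SDK conventions;
# the repository may expect its API keys somewhere else.
ASSUMED_KEY_VARS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GOOGLE_API_KEY")


def missing_api_keys(env=None):
    """Return the assumed key variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in ASSUMED_KEY_VARS if not env.get(name)]


def build_stage2_command(task, action, models, iterations=3):
    """Build the argument list for one stage2.py call using the flags shown above."""
    return [
        "python", "stage2.py",
        "--task", task,
        "--action", action,
        "--models", *models,
        "--iterations", str(iterations),
    ]


if __name__ == "__main__":
    models = ["instructBlip-7B", "instructBlip-13B", "llava", "gpt-4", "claude-v1"]
    missing = missing_api_keys()
    if missing:
        print("Unset (assumed) key variables:", ", ".join(missing))
    # Generate first, then evaluate, mirroring the two example commands.
    for action in ("generate_responses", "evaluate"):
        print(shlex.join(build_stage2_command("IR", action, models)))
```

To actually execute a command, the returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root.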

☎️ Contact us

Kehan Guo: kguo2@nd.edu