AutoRD

  • AutoRD: An Automatic and End-to-end Rare Disease Knowledge Graph Construction Framework Based on Ontologies-enhanced Large Language Models

Quick Start

# data preprocessing
cd data_preprocessing

# copy the original RareDis dataset into `data_preprocessing/data`, renaming it so that it replaces `data_preprocessing/data/RareDis-fixed`

# fix some data annotation errors
cd RareDis-fixed
python fix_data.py

cd ../
python generate_data.py
python parse_HOOM.py
python parse_mondo_obo.py
python parse_ORDO.py


# put the newly generated dataset and the processed ontologies in `./data`

# run main code
cd ../
bash run.sh
  • Find your results at ./cache_data
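
Several of the steps above rely on moving files by hand, so a small preflight check before `bash run.sh` can catch a missing directory early. The following is only a minimal sketch and is not part of the repository: `check_paths` is a hypothetical helper, and the two paths passed to it assume that `data` and `data_preprocessing/data/RareDis-fixed` are resolved from the repository root. Adjust them to your layout.

```shell
# Hypothetical helper (not shipped with AutoRD): print every path that does
# not exist and return non-zero if at least one is missing.
check_paths() {
  missing=0
  for p in "$@"; do
    if [ ! -e "$p" ]; then
      echo "missing: $p"
      missing=1
    fi
  done
  return "$missing"
}

# Assumed locations, relative to the repository root.
check_paths data_preprocessing/data/RareDis-fixed data \
  || echo "finish the preprocessing steps before running run.sh"
```

If every path exists, the helper prints nothing and returns 0, so it can be chained as `check_paths ... && bash run.sh`.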

Baselines

## run baselines
# fine-tuning baseline
cd finetuning_baseline
python main.py

# pure LLM baseline (assumes LLM_baseline sits next to finetuning_baseline)
cd ../LLM_baseline
python main.py