MetaIE 🌐 [Paper]

MetaIE is a meta-model distilled from ChatGPT-3.5-turbo for information extraction. It is an intermediate checkpoint that transfers well to all kinds of downstream information extraction tasks.

Link to MetaIE Paper

To begin 🚀

First, install the required packages.

pip install -r requirements.txt

Distillation Dataset Sampling 📖

You can create your own distillation dataset based on your own corpus:

python distillation_dataset_sampling.py <your OpenAI API key> <path to your corpus (e.g. example.txt)> <path to distillation dataset (e.g. distill/metaie.json)>
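If you prefer not to type the API key on the command line, the same call can be wrapped in a small Python helper. This is a minimal sketch, not part of the repo: `build_sampling_command` and `run_sampling` are hypothetical names, reading the key from an `OPENAI_API_KEY` environment variable is an assumption, and so is creating the output directory up front (the script may already do this). Only the script name and the order of its three positional arguments come from the command above.

```python
import os
import subprocess
import sys
from typing import List


def build_sampling_command(api_key: str, corpus_path: str, output_path: str) -> List[str]:
    """Build the argument list for distillation_dataset_sampling.py.

    The positional order (API key, corpus path, output path) follows the
    command shown in this README.
    """
    return [
        sys.executable,  # the current Python interpreter
        "distillation_dataset_sampling.py",
        api_key,
        corpus_path,
        output_path,
    ]


def run_sampling(corpus_path: str, output_path: str) -> None:
    """Run the sampling script with the key taken from the environment.

    Assumption: the key is stored in the OPENAI_API_KEY environment variable.
    """
    api_key = os.environ["OPENAI_API_KEY"]
    # Assumption: the output directory may not exist yet, so create it first.
    os.makedirs(os.path.dirname(output_path) or ".", exist_ok=True)
    subprocess.run(build_sampling_command(api_key, corpus_path, output_path), check=True)
```

For example, `run_sampling("example.txt", "distill/metaie.json")` run from the repo root would mirror the command above. This keeps the key out of your shell history, although it is still passed to the script as an argument.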

To avoid API costs, you can instead set the train_file argument in the meta-learning script to KomeijiForce/MetaIE-Pretrain, the dataset used in our experiments.
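Before training on the released dataset, it can be useful to look at its splits and fields. The sketch below assumes the `datasets` library is installed and that KomeijiForce/MetaIE-Pretrain loads with its default configuration; `load_metaie_pretrain` is a hypothetical helper name, not something the repo provides.

```python
DATASET_ID = "KomeijiForce/MetaIE-Pretrain"  # dataset ID named in this README


def load_metaie_pretrain():
    """Download the released distillation dataset from the Hugging Face Hub.

    Assumption: the dataset loads with its default configuration.
    The import is kept local so that defining this helper does not
    require `datasets` to be installed.
    """
    from datasets import load_dataset  # pip install datasets

    return load_dataset(DATASET_ID)
```

Calling `print(load_metaie_pretrain())` lists the available splits and column names, which you can compare against what the meta-learning script expects.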

Meta-learning 🤖

bash pretrain.sh

Pre-trained checkpoints 🔑

You can directly use our pre-trained English and multilingual MetaIE models from Hugging Face. The README in the Hugging Face repo explains the mechanism of MetaIE in more detail.

Update: A GPT-4-distilled Checkpoint is available now!

Update: A GPT-4o-distilled checkpoint for the academic domain is available now!

Dataset 📚

Our distillation dataset is available on Hugging Face.

Downstream Scenario (CoNLL2003 as an example) 🛠️

Fine-tuning 🔧

bash tune_ner.sh

Inference 🧠

python inference.py
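For CoNLL2003-style NER, token-level predictions are commonly expressed as BIO tags and then grouped into entity spans. The helper below is a minimal sketch of that grouping step under the assumption that the fine-tuned model emits one BIO tag per token; `bio_to_spans` is a hypothetical name, and the output format of inference.py may differ.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs."""
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    def close_span() -> None:
        # Emit the span collected so far, if any, and reset the state.
        nonlocal current_tokens, current_type
        if current_tokens and current_type is not None:
            spans.append((" ".join(current_tokens), current_type))
        current_tokens, current_type = [], None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A "B-" tag always starts a new span.
            close_span()
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_tokens and tag[2:] == current_type:
            # An "I-" tag of the same type extends the open span.
            current_tokens.append(token)
        else:
            # "O" or an inconsistent "I-" tag closes the open span;
            # the token itself is not added to any span.
            close_span()

    close_span()
    return spans
```

For instance, the tokens `["Peter", "Blackburn", "visited", "Brussels"]` with tags `["B-PER", "I-PER", "O", "B-LOC"]` yield one PER span and one LOC span.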

Citation 📝

@article{MetaIE,
  author       = {Letian Peng and
                  Zilong Wang and
                  Feng Yao and
                  Zihan Wang and
                  Jingbo Shang},
  title        = {MetaIE: Distilling a Meta Model from {LLM} for All Kinds of Information
                  Extraction Tasks},
  journal      = {CoRR},
  volume       = {abs/2404.00457},
  year         = {2024},
  url          = {https://doi.org/10.48550/arXiv.2404.00457},
  doi          = {10.48550/ARXIV.2404.00457},
  eprinttype    = {arXiv},
  eprint       = {2404.00457},
  timestamp    = {Wed, 08 May 2024 17:22:41 +0200},
  biburl       = {https://dblp.org/rec/journals/corr/abs-2404-00457.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}