BertForChID

A repository for idiom NER and idiom cloze, using a BERT model and the ChID dataset.

Primary language: Python

Introduction

This repository covers two tasks, idiom NER and idiom cloze, based on "ChID: A Large-scale Chinese IDiom Dataset for Cloze Test".

Data Process

  1. Download the ChID dataset from here into the data/chid folder,

    including the train_data.txt, dev_data.txt, and test_data.txt files.

  2. Download the bert-base-chinese model from here into the data/bert folder,

    including the config.json, vocab.txt, and pytorch_model.bin files.
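
Before starting a run, it can help to confirm that both downloads landed in the right folders. The following is a minimal sketch, not part of the repository: the folder and file names are taken from the two steps above, while the helper name find_missing_files and the script itself are hypothetical.

```python
from pathlib import Path

# Folder and file names come from the download steps above.
# The helper below is a hypothetical convenience, not part of the repository.
EXPECTED_FILES = {
    "data/chid": ["train_data.txt", "dev_data.txt", "test_data.txt"],
    "data/bert": ["config.json", "vocab.txt", "pytorch_model.bin"],
}


def find_missing_files(root="."):
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    missing = []
    for folder, names in EXPECTED_FILES.items():
        for name in names:
            if not (root / folder / name).is_file():
                missing.append(f"{folder}/{name}")
    return missing


if __name__ == "__main__":
    missing = find_missing_files()
    if missing:
        print("Missing files:")
        for path in missing:
            print("  " + path)
    else:
        print("All expected files are present.")
```

Running the script from the repository root lists any file that still needs to be downloaded.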

Main Process

  • Task One: Idiom NER

    python main1.py --name NER

  • Task Two: Idiom Cloze

    python main2.py --name Cloze

You can change the configuration either with command-line parameters or by editing the defaults in parser.py.
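
To illustrate how command-line parameters and defaults in a parser file typically interact, here is a small argparse sketch. It is an assumption about the general pattern, not a copy of the repository's parser.py: only the --name flag appears in the commands above, and --batch_size and --lr, along with their default values, are hypothetical placeholders.

```python
import argparse


def build_parser():
    """Build an illustrative argument parser (hypothetical, not the repository's)."""
    parser = argparse.ArgumentParser(description="Illustrative configuration sketch")
    # --name is the only flag shown in the commands above.
    parser.add_argument("--name", type=str, default="NER", help="name of the run")
    # The two options below are hypothetical examples of tunable settings.
    parser.add_argument("--batch_size", type=int, default=32, help="hypothetical option")
    parser.add_argument("--lr", type=float, default=2e-5, help="hypothetical option")
    return parser


if __name__ == "__main__":
    # Values given on the command line override the defaults defined above.
    args = build_parser().parse_args()
    print(vars(args))
```

With this pattern, editing a default changes the behavior of every run, while passing a flag on the command line overrides it for a single run.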