├── CAIS                          #CAIS datasets
│   ├── dataLoder                 #loaders for the different forms of the CAIS datasets
│   │   ├── GlobalPointerDataloder.py
│   │   ├── mergeCAISDataloder.py
│   │   └── sourceDataloder.py
│   ├── GlobalPointerCAIS         #original datasets with entities represented by their positions
│   │   ├── dev.json
│   │   ├── test.json
│   │   └── train.json
│   ├── mergeCAIS                 #original datasets with the intent filled into the slot labels and merged
│   │   ├── dev.txt
│   │   ├── test.txt
│   │   └── train.txt
│   ├── source                    #original datasets
│   │   ├── test
│   │   │   ├── ch.test
│   │   │   └── ch.test.intent
│   │   ├── train
│   │   │   ├── ch.train
│   │   │   └── ch.train.intent
│   │   └── valid
│   │       ├── ch.valid
│   │       └── ch.valid.intent
│   └── SourceToGlobalPointer.py  #dataset conversion program
├── README.md
└── SMP
    ├── GlobalPointer
    │   ├── GlobalPointerSMP2019  #original datasets with entities represented by their positions
    │   │   ├── dev.json
    │   │   ├── test.json
    │   │   └── train.json
    │   └── GlobalPointerSMP2020
    │       ├── dev.json
    │       ├── test.json
    │       └── train.json
    ├── GlobalPointerToMerge.py   #dataset conversion program
    ├── merge                     #original datasets with the intent filled into the slot labels and merged
    │   ├── 2019mergeSMP
    │   │   ├── dev.txt
    │   │   ├── test.txt
    │   │   └── train.txt
    │   └── 2020mergeSMP
    │       ├── dev.txt
    │       ├── test.txt
    │       └── train.txt
    ├── source                    #original datasets
    │   ├── 2019train.json
    │   └── 2020train.json
    └── sourceToGlobalPointer.py  #dataset conversion program

CAIS originates from the paper "CM-Net: A Novel Collaborative Memory Network for Spoken Language Understanding". The CAIS dataset includes 7,995 training, 994 validation, and 1,024 test utterances. The original data can be downloaded from GitHub.

model | Slot(F1) | Intent(Acc) | Overall(Acc) |
---|---|---|---|
Slot-Gated | 82.21 | 93.87 | 80.43 |
SF-ID Network | 86.34 | 94.66 | 84.09 |
CM-Net | 86.16 | 94.56 | - |
Stack-Propagation | 87.65 | 94.57 | 84.68 |
Multi-Level Word Adapter | 88.57 | 94.66 | 85.47 |
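
The Intent(Acc) and Overall(Acc) columns can be illustrated with a small sketch. It assumes the usual convention in spoken language understanding work, which this README does not spell out: intent accuracy is the fraction of utterances whose intent is predicted correctly, and overall accuracy counts an utterance only when both its intent and every slot label are correct. The function names and the toy labels below are hypothetical and are not taken from this repository.

```python
def intent_accuracy(gold_intents, pred_intents):
    """Fraction of utterances whose predicted intent equals the gold intent."""
    correct = sum(g == p for g, p in zip(gold_intents, pred_intents))
    return correct / len(gold_intents)


def overall_accuracy(gold_intents, pred_intents, gold_slots, pred_slots):
    """Fraction of utterances with a correct intent AND a fully correct slot sequence."""
    correct = sum(
        gi == pi and gs == ps
        for gi, pi, gs, ps in zip(gold_intents, pred_intents, gold_slots, pred_slots)
    )
    return correct / len(gold_intents)


# Toy data (hypothetical labels, for illustration only).
gold_intents = ["music", "weather", "music"]
pred_intents = ["music", "weather", "weather"]
gold_slots = [["B-song", "I-song"], ["O", "B-city"], ["O", "O"]]
pred_slots = [["B-song", "I-song"], ["O", "O"], ["O", "O"]]

intent_acc = intent_accuracy(gold_intents, pred_intents)
overall_acc = overall_accuracy(gold_intents, pred_intents, gold_slots, pred_slots)
print(f"intent acc = {intent_acc:.4f}, overall acc = {overall_acc:.4f}")
```

In the toy data only the first utterance has both a correct intent and a correct slot sequence, so overall accuracy is lower than intent accuracy, which is the same pattern the tables show. Slot(F1) is normally computed at the entity level and is not covered by this sketch.
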
SMP comes from the Evaluation of Chinese Dialogue Technology (ECDT) task of the 8th National Social Media Processing Conference (SMP 2019) and the 9th National Social Media Processing Conference (SMP 2020). Because the competitions have ended, only the training data is used here: the original training set of each year is split into a training set, a validation set, and a test set at a ratio of 8:1:1, in proportion to the number of utterances of each intent. The SMP2019 dataset includes 2,053 training, 256 validation, and 270 test utterances. The SMP2020 dataset includes 4,011 training, 493 validation, and 520 test utterances.

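
A per-intent 8:1:1 split of this kind can be sketched as follows. This is a minimal illustration and not the script used to produce the released files: the `intent` key, the function name `split_by_intent`, the fixed seed, and the rounding rule are all assumptions, so the resulting counts will not necessarily match the numbers reported above.

```python
import random
from collections import defaultdict


def split_by_intent(samples, ratios=(8, 1, 1), seed=0):
    """Split samples into train/dev/test so that each intent keeps roughly the given ratio."""
    # Group the samples by their intent label (assumed to be stored under "intent").
    by_intent = defaultdict(list)
    for sample in samples:
        by_intent[sample["intent"]].append(sample)

    rng = random.Random(seed)
    total = sum(ratios)
    train, dev, test = [], [], []
    for intent in sorted(by_intent):
        group = by_intent[intent]
        rng.shuffle(group)
        n = len(group)
        n_dev = n * ratios[1] // total
        n_test = n * ratios[2] // total
        n_train = n - n_dev - n_test  # the remainder goes to the training set
        train.extend(group[:n_train])
        dev.extend(group[n_train:n_train + n_dev])
        test.extend(group[n_train + n_dev:])
    return train, dev, test


# Toy example: 20 utterances of intent "a" and 10 of intent "b".
samples = [{"text": f"u{i}", "intent": "a" if i < 20 else "b"} for i in range(30)]
train, dev, test = split_by_intent(samples)
print(len(train), len(dev), len(test))
```

Splitting within each intent, instead of over the whole set at once, keeps rare intents represented in all three subsets as long as they have enough utterances.
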
Results on SMP2019:

model | Slot(F1) | Intent(Acc) | Overall(Acc) |
---|---|---|---|
Slot-Gated | 62.94 | 91.11 | 57.03 |
SF-ID Network | 71.59 | 94.07 | 63.33 |
CM-Net | - | - | - |
Stack-Propagation | 78.91 | 94.44 | 72.59 |
Multi-Level Word Adapter | 73.60 | 93.70 | 70.00 |

Results on SMP2020:

model | Slot(F1) | Intent(Acc) | Overall(Acc) |
---|---|---|---|
Slot-Gated | 70.45 | 91.15 | 65.65 |
SF-ID Network | 78.47 | 92.69 | 71.34 |
CM-Net | - | - | - |
Stack-Propagation | 82.50 | 94.03 | 76.15 |
Multi-Level Word Adapter | 84.32 | 96.34 | 80.76 |
The detailed code of our paper will be uploaded soon.