CAIS and SMP

Directory

├── CAIS                                                    #CAIS datasets
│   ├── dataLoder                                           #Loaders for the different forms of the CAIS dataset
│   │   ├── GlobalPointerDataloder.py
│   │   ├── mergeCAISDataloder.py
│   │   └── sourceDataloder.py
│   ├── GlobalPointerCAIS                                   #Original dataset with entities represented by their positions
│   │   ├── dev.json
│   │   ├── test.json
│   │   └── train.json
│   ├── mergeCAIS                                           #Original dataset with the intent filled into the slot labels and merged
│   │   ├── dev.txt
│   │   ├── test.txt
│   │   └── train.txt
│   ├── source                                              #original datasets
│   │   ├── test
│   │   │   ├── ch.test
│   │   │   └── ch.test.intent
│   │   ├── train
│   │   │   ├── ch.train
│   │   │   └── ch.train.intent
│   │   └── valid
│   │       ├── ch.valid
│   │       └── ch.valid.intent
│   └── SourceToGlobalPointer.py                             #Dataset conversion script
├── README.md
└── SMP
    ├── GlobalPointer
    │   ├── GlobalPointerSMP2019                             #Original dataset with entities represented by their positions
    │   │   ├── dev.json
    │   │   ├── test.json
    │   │   └── train.json
    │   └── GlobalPointerSMP2020
    │       ├── dev.json
    │       ├── test.json
    │       └── train.json
    ├── GlobalPointerToMerge.py                              #Dataset conversion script
    ├── merge                                                #Original dataset with the intent filled into the slot labels and merged
    │   ├── 2019mergeSMP
    │   │   ├── dev.txt
    │   │   ├── test.txt
    │   │   └── train.txt
    │   └── 2020mergeSMP
    │       ├── dev.txt
    │       ├── test.txt
    │       └── train.txt
    ├── source                                               #original datasets
    │   ├── 2019train.json
    │   └── 2020train.json
    └── sourceToGlobalPointer.py                             #Dataset conversion script

Introduction:

CAIS:

CAIS originates from the paper "CM-Net: A Novel Collaborative Memory Network for Spoken Language Understanding". The CAIS dataset includes 7,995 training, 994 validation, and 1,024 test utterances. The original data can be downloaded from GitHub.

CAIS baseline

Model	Slot (F1)	Intent (Acc)	Overall (Acc)
Slot-Gated	82.21	93.87	80.43
SF-ID Network	86.34	94.66	84.09
CM-Net	86.16	94.56	-
Stack-Propagation	87.65	94.57	84.68
Multi-Level Word Adapter	88.57	94.66	85.47

SMP:

SMP comes from the Evaluation of Chinese Dialogue Technology (ECDT) task of the 8th National Social Media Processing Conference (SMP 2019) and the 9th National Social Media Processing Conference (SMP 2020). Because the competitions have ended, only the training data is used here. The original training set is split, according to the number of utterances per intent, into training, validation, and test sets at a ratio of 8:1:1. The SMP2019 dataset includes 2,053 training, 256 validation, and 270 test utterances. The SMP2020 dataset includes 4,011 training, 493 validation, and 520 test utterances.
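An 8:1:1 split per intent can be sketched as follows. This is only an illustration under stated assumptions, not the script used to produce the files in this repository: the function name `split_by_intent`, the `"intent"` field name, the fixed seed, and the choice to put rounding leftovers into the test set are all hypothetical.

```python
import random
from collections import defaultdict


def split_by_intent(samples, ratios=(8, 1, 1), seed=0):
    """Split samples into train/dev/test so each intent keeps roughly the given ratio.

    Assumes every sample is a dict with an "intent" key (hypothetical field name).
    """
    rng = random.Random(seed)

    # Group the utterances by their intent label.
    by_intent = defaultdict(list)
    for sample in samples:
        by_intent[sample["intent"]].append(sample)

    train, dev, test = [], [], []
    total = sum(ratios)
    for intent in sorted(by_intent):
        group = by_intent[intent]
        rng.shuffle(group)
        n_train = len(group) * ratios[0] // total
        n_dev = len(group) * ratios[1] // total
        train.extend(group[:n_train])
        dev.extend(group[n_train:n_train + n_dev])
        # Whatever is left after integer rounding goes to the test set.
        test.extend(group[n_train + n_dev:])
    return train, dev, test


if __name__ == "__main__":
    # Toy data: 40 utterances spread evenly over two made-up intents.
    toy = [
        {"text": f"utterance {i}", "intent": "LAUNCH" if i % 2 else "QUERY"}
        for i in range(40)
    ]
    train, dev, test = split_by_intent(toy)
    print(len(train), len(dev), len(test))
```

Splitting inside each intent group, rather than over the whole set at once, keeps rare intents represented in all three subsets as long as the group is large enough.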

SMP2019 baseline

Model	Slot (F1)	Intent (Acc)	Overall (Acc)
Slot-Gated	62.94	91.11	57.03
SF-ID Network	71.59	94.07	63.33
CM-Net	-	-	-
Stack-Propagation	78.91	94.44	72.59
Multi-Level Word Adapter	73.60	93.70	70.00

SMP2020 baseline

Model	Slot (F1)	Intent (Acc)	Overall (Acc)
Slot-Gated	70.45	91.15	65.65
SF-ID Network	78.47	92.69	71.34
CM-Net	-	-	-
Stack-Propagation	82.50	94.03	76.15
Multi-Level Word Adapter	84.32	96.34	80.76

The detailed code of our paper will be uploaded soon.
