/vicoqa-cmc

Vietnamese Conversational Machine Comprehension dataset (UIT-ViCoQA)

This dataset is used for the Conversational Machine Comprehension task in Vietnamese.
The UIT-ViCoQA dataset consists of 10,000 questions with answers over 2,000 conversations about health news articles.
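
With 10,000 questions across 2,000 conversations, each conversation has about 5 question-answer turns on average. As a minimal sketch of how such a conversational dataset might be traversed, the Python snippet below assumes a CoQA-style JSON layout. The field names (`data`, `story`, `questions`, `answers`, `turn_id`, `input_text`), the helper functions `iter_turns` and `summarize`, and the toy sample are all assumptions made for illustration; the actual UIT-ViCoQA schema may differ, so check the released files before relying on this.

```python
import json

# Toy sample in an ASSUMED CoQA-style layout (not real UIT-ViCoQA data).
SAMPLE_JSON = """
{
  "data": [
    {
      "id": "conv-1",
      "story": "Toy article text about a health topic.",
      "questions": [
        {"turn_id": 1, "input_text": "What is the article about?"},
        {"turn_id": 2, "input_text": "Who is affected?"}
      ],
      "answers": [
        {"turn_id": 1, "input_text": "A health topic."},
        {"turn_id": 2, "input_text": "Adults."}
      ]
    },
    {
      "id": "conv-2",
      "story": "Another toy article.",
      "questions": [
        {"turn_id": 1, "input_text": "What is recommended?"}
      ],
      "answers": [
        {"turn_id": 1, "input_text": "Regular exercise."}
      ]
    }
  ]
}
"""


def iter_turns(dataset):
    """Yield (conversation id, turn id, question, answer) for every turn."""
    for conv in dataset["data"]:
        # Index answers by turn so each question can be paired with its answer.
        answers = {a["turn_id"]: a["input_text"] for a in conv["answers"]}
        for q in conv["questions"]:
            yield conv["id"], q["turn_id"], q["input_text"], answers.get(q["turn_id"])


def summarize(dataset):
    """Count conversations and questions, and compute the average turns per conversation."""
    n_conv = len(dataset["data"])
    n_questions = sum(len(conv["questions"]) for conv in dataset["data"])
    avg = n_questions / n_conv if n_conv else 0.0
    return {"conversations": n_conv, "questions": n_questions, "avg_turns": avg}


dataset = json.loads(SAMPLE_JSON)
print(summarize(dataset))  # {'conversations': 2, 'questions': 3, 'avg_turns': 1.5}
for conv_id, turn_id, question, answer in iter_turns(dataset):
    print(conv_id, turn_id, question, "->", answer)
```

To use this with a real file, replace `json.loads(SAMPLE_JSON)` with `json.load(open(path, encoding="utf-8"))`, where `path` is a hypothetical local path to the dataset file.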

Data inquiry

The dataset is available at this website.
Please contact the following authors for the dataset:

Baseline codes

The baseline code implemented for the dataset includes DrQA, SDNet, FlowQA, and GraphFlow.

Publication

Please cite this paper if you use the dataset:

Luu, Son T., et al. "Conversational Machine Reading Comprehension for Vietnamese Healthcare Texts." 
International Conference on Computational Collective Intelligence. Springer, Cham, 2021.

Link: https://arxiv.org/abs/2105.01542