ViRe4MRC at PACLIC 37

Introduction

This repository contains the data for the paper "Machine Reading Comprehension for Vietnamese Customer Reviews: Task, Corpus and Baseline Models", published at PACLIC 37 (2023).

ViRe4MRC is the first benchmark for review-based machine reading comprehension in Vietnamese. It contains 6,603 human-generated data points built from 2,174 reviews in two domains: restaurants and smartphones.

Notice: This dataset is published for research purposes only and is not intended for commercial use.

Data Example

[Image: data example]

Citation

@inproceedings{do-etal-2023-machine,
    title = "Machine Reading Comprehension for {V}ietnamese Customer Reviews: Task, Corpus and Baseline Models",
    author = "Do, Tinh Pham Phuc  and
      Cao, Ngoc Dinh Duy  and
      Nguyen, Nhan Thanh  and
      Huynh, Tin Van  and
      Nguyen, Kiet Van",
    editor = "Huang, Chu-Ren  and
      Harada, Yasunari  and
      Kim, Jong-Bok  and
      Chen, Si  and
      Hsu, Yu-Yin  and
      Chersoni, Emmanuele  and
      A, Pranav  and
      Zeng, Winnie Huiheng  and
      Peng, Bo  and
      Li, Yuxi  and
      Li, Junlin",
    booktitle = "Proceedings of the 37th Pacific Asia Conference on Language, Information and Computation",
    month = dec,
    year = "2023",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.paclic-1.3",
    pages = "24--35",
}

Contact

Authors:

Tinh Pham Phuc Do, Ngoc Dinh Duy Cao, Nhan Thanh Nguyen, Tin Van Huynh, Kiet Van Nguyen

Faculty of Information Science and Engineering, University of Information Technology, Ho Chi Minh City, Vietnam

Vietnam National University, Ho Chi Minh City, Vietnam

{20522020, 20521661, 20521701}@gm.uit.edu.vn and {tinhv,kietnv}@uit.edu.vn