DSTC10 Track 2 - Knowledge-grounded Task-oriented Dialogue Modeling on Spoken Conversations

This repository contains the data, scripts, and baseline code for DSTC10 Track 2.

This challenge track benchmarks the robustness of conversational models against the gaps between written and spoken conversations. Specifically, it targets two tasks: 1) multi-domain dialogue state tracking and 2) task-oriented conversational modeling with unstructured knowledge access. For both tasks, participants develop models using any existing public data and submit their model outputs on the unlabeled test set, which is built from ASR outputs.
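To make the submission workflow concrete, below is a minimal sketch of loading dialogue logs and pulling out the most recent user turn as model input. The file layout and field names (`logs.json`, `speaker`, `text`) are assumptions modeled on earlier DSTC knowledge-grounded tracks; check this repository's data description for the actual schema.

```python
import json

# NOTE: file name and field names ("speaker", "text", speaker code "U")
# are hypothetical; consult the repository's data documentation.
def load_dialogues(logs_path):
    """Load dialogue contexts: a list of dialogues, each a list of turns."""
    with open(logs_path) as f:
        return json.load(f)

def latest_user_turn(dialogue):
    """Return the text of the most recent user turn, or None if absent."""
    for turn in reversed(dialogue):
        if turn.get("speaker") == "U":
            return turn.get("text")
    return None
```

A model would typically consume the full turn list as context; taking only the latest user turn here just illustrates iterating over the assumed structure.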

Organizers: Seokhwan Kim, Yang Liu, Di Jin, Alexandros Papangelis, Behnam Hedayatnia, Karthik Gopalakrishnan, Dilek Hakkani-Tur


If you want to publish experimental results with this dataset or use the baseline models, please cite this article:

@misc{kim2021how,
      title={"How robust r u?": Evaluating Task-Oriented Dialogue Systems on Spoken Conversations}, 
      author={Seokhwan Kim and Yang Liu and Di Jin and Alexandros Papangelis and Karthik Gopalakrishnan and Behnam Hedayatnia and Dilek Hakkani-Tur},
      year={2021},
      eprint={2109.13489},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Timeline

  • Validation data released: Jun 14, 2021
  • Test data released: Sep 13, 2021
  • Entry submission deadline: Sep 21, 2021
  • Objective evaluation completed: Sep 28, 2021
  • Human evaluation completed: Oct 8, 2021

Rules

  • Participation is welcome from any team (academic, corporate, non-profit, or government).
  • Each team can participate in either or both sub-tracks by submitting up to 5 entries for each track.
  • The identities of participants will NOT be published or made public. In written results, teams will be identified by team IDs (e.g., team1, team2). The organizers will verbally announce the identities of all teams at the workshop chosen for communicating results.
  • Participants may reveal their own team label (e.g., team5) in publications or presentations if they wish, but may not reveal the identities of other teams.
  • Participants are allowed to use any external datasets, resources or pre-trained models which are publicly available.
  • Participants are NOT allowed to do any manual examination or modification of the test data.
  • All submitted system outputs, together with the evaluation results, will be released to the public after the evaluation period.

Contact

Join the DSTC mailing list to get the latest updates about DSTC10.

For specific inquiries about DSTC10 Track 2, please feel free to contact: seokhwk (at) amazon (dot) com