FTAD - Fine-grained Turn-taking Action Dataset

Dataset for paper "Human-to-Human Conversation Dataset for Learning Fine-grained Turn-taking Action".

Overview

All data files are under the data directory.

utterrances contains the structural annotation of the dialogue transcriptions:
- utter.txt: the dialogue corpus, one IPU (inter-pausal unit) per line, with tab-separated columns
- decision.txt: the turn-taking decision points (DPs) of each dialogue and the corresponding actions, with tab-separated columns

tasks contains the turn-taking prediction task data constructed from utterrances; the train, dev, and test splits each contain the following task data files:
- eot.txt: end-of-turn prediction
- break.txt: response prediction when the opponent interrupts
- word_backchannel.txt: sequential prediction task for backchannel
- response_latency.txt: expected response time prediction
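
As a rough sketch of the layout described above, the task files can be enumerated per split. The directory nesting used here (data/tasks/<split>/<file>) is an assumption inferred from the description, not something confirmed by the dataset itself, and `task_paths` is a hypothetical helper name.

```python
from pathlib import Path

# Split and file names are taken from the list above.
SPLITS = ("train", "dev", "test")
TASK_FILES = (
    "eot.txt",
    "break.txt",
    "word_backchannel.txt",
    "response_latency.txt",
)


def task_paths(data_root="data"):
    """Map (split, file name) to its expected path.

    Assumption: each split is a sub-directory of data/tasks.
    """
    root = Path(data_root) / "tasks"
    return {
        (split, name): root / split / name
        for split in SPLITS
        for name in TASK_FILES
    }
```

Checking `path.exists()` on each returned value is a quick way to confirm whether a local copy actually follows this layout.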

Columns of utter.txt:
session_id: session id from Switchboard
type: speaker role from one speaker's perspective; 'user' denotes the opponent and 'agent' denotes the speaker themself
id: sequential utterance id for each speaker, starting from 0
begin_time: begin time (in milliseconds) of the current utterance
end_time: end time (in milliseconds) of the current utterance
begin_decision_id: id of the decision point (DP) at the beginning of the utterance; -1 means no DP can be associated with this utterance. Only available for agent utterances.
end_decision_id: id of the decision point (DP) at the end of the utterance; -1 means no DP can be associated with this utterance. Only available for agent utterances.
text: the utterance text
ext_msg: additional annotations generated by the pipeline, such as shrink tags
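
The following is a minimal parsing sketch for one utterance line. It assumes that the fields above describe the columns of utter.txt in the listed order, that the file has no header row, and that the decision id columns may be empty for user utterances. None of this is confirmed by the source. `Utterance` and `parse_utter_line` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Utterance:
    session_id: str
    type: str                         # 'user' or 'agent'
    id: int
    begin_time: float                 # milliseconds
    end_time: float                   # milliseconds
    begin_decision_id: Optional[int]  # -1: no DP; None: column empty
    end_decision_id: Optional[int]
    text: str
    ext_msg: str


def _opt_int(value):
    """Parse an integer column, returning None when it is empty."""
    value = value.strip()
    return int(value) if value else None


def parse_utter_line(line):
    """Parse one tab-separated line (assumed column order, see above)."""
    cols = line.rstrip("\r\n").split("\t")
    cols += [""] * (9 - len(cols))    # tolerate missing trailing columns
    return Utterance(
        session_id=cols[0],
        type=cols[1],
        id=int(cols[2]),
        begin_time=float(cols[3]),
        end_time=float(cols[4]),
        begin_decision_id=_opt_int(cols[5]),
        end_decision_id=_opt_int(cols[6]),
        text=cols[7],
        ext_msg=cols[8],
    )
```

With a made-up line such as `"sw0000\tagent\t0\t1200\t3400\t-1\t5\thello there\t"`, the utterance duration is `end_time - begin_time`.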

Columns of decision.txt:
session_id: session id from Switchboard
id: id of the decision point (DP)
time: timestamp of the decision point
state: duplex state of the dialogue at the moment of the DP (illustrated at the top)
bias: the alignment error (in milliseconds) between the DP and the closest utterance; a negative value means the utterance event comes before the DP
act: the action the subject has taken at this DP
ext_msg: additional annotations generated by the pipeline
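
To look up the decision point referenced by an utterance's begin_decision_id or end_decision_id, the decision records can be indexed by session and id. This sketch assumes the fields above are the tab-separated columns of decision.txt in the listed order, with no header row. The names `Decision`, `parse_decision_line`, and `index_decisions` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    session_id: str
    id: int
    time: float
    state: str
    bias: float      # milliseconds; negative: utterance event precedes the DP
    act: str
    ext_msg: str


def parse_decision_line(line):
    """Parse one tab-separated line (assumed column order, see above)."""
    cols = line.rstrip("\r\n").split("\t")
    cols += [""] * (7 - len(cols))    # tolerate missing trailing columns
    return Decision(
        session_id=cols[0],
        id=int(cols[1]),
        time=float(cols[2]),
        state=cols[3],
        bias=float(cols[4]),
        act=cols[5],
        ext_msg=cols[6],
    )


def index_decisions(lines):
    """Index decisions by (session_id, id), skipping blank lines."""
    index = {}
    for line in lines:
        if line.strip():
            decision = parse_decision_line(line)
            index[(decision.session_id, decision.id)] = decision
    return index
```

Keying on the pair (session_id, id) works whether DP ids are unique globally or only within a session, which the description does not specify. An id of -1 has no entry, so `index.get(...)` returns None for it.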

Columns of break.txt:
context: the last three utterances before the DP, separated by '|'
subject_utterrance: the subject's utterance that is being interrupted
opponent_utterrance: the opponent's utterance that interrupts the subject
label: 1 means the subject accepts the interruption and stops speaking; 0 means the subject keeps speaking
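
A single example with these fields could be read as sketched below. The assumptions are that the fields above belong to break.txt, that the file is tab-separated like the files under utterrances, and that the columns appear in the listed order. `parse_break_line` is a hypothetical helper.

```python
def parse_break_line(line):
    """Parse one example: context, subject, opponent, label.

    Assumption: four tab-separated columns in the order listed above.
    """
    cols = line.rstrip("\r\n").split("\t")
    if len(cols) != 4:
        raise ValueError(f"expected 4 columns, got {len(cols)}")
    context, subject, opponent, label = cols
    return {
        # The context holds up to three utterances joined by '|'.
        "context": [part.strip() for part in context.split("|")],
        "subject_utterrance": subject,
        "opponent_utterrance": opponent,
        # 1: the subject yields to the interruption; 0: keeps speaking.
        "label": int(label),
    }
```

Splitting the context on '|' assumes the utterance text never contains that character.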