Character Identification is an entity linking task that identifies each mention in a multiparty dialogue as a certain character. A mention is a nominal referring to a person (e.g., she, mom, Judy), and an entity is a character in a dialogue. The goal is to assign each mention to its entity, who may or may not participate in the dialogue. For example, when a speaker says "mom", the mention is not one of the speakers; nonetheless, it clearly refers to a specific person, Judy, who could also appear in other dialogues. Identifying such mentions as real characters requires cross-document entity resolution, which makes this task challenging.
This task is described in the following papers:
- SemEval 2018 Task 4: Character Identification on Multiparty Dialogues, Jinho D. Choi and Henry Y. Chen, Proceedings of the International Workshop on Semantic Evaluation, SemEval'18, 57-64, New Orleans, LA, 2018.
- Robust Coreference Resolution and Entity Linking on Dialogues: Character Identification on TV Show Transcripts, Henry Y. Chen, Ethan Zhou, and Jinho D. Choi, Proceedings of the 21st Conference on Computational Natural Language Learning, CoNLL'17, 216-225, Vancouver, Canada, 2017.
- Character Identification on Multiparty Conversation: Identifying Mentions of Characters in TV Shows, Henry Y. Chen and Jinho D. Choi, Proceedings of the 17th Annual SIGdial Meeting on Discourse and Dialogue, SIGDIAL'16, 90-100, Los Angeles, CA, 2016.
Organizers:
- Jinho D. Choi (Emory University).
- Henry Y. Chen (Snap Inc.).
The first two seasons of the TV show Friends are annotated for this task. Each season consists of episodes, each episode comprises scenes, and each scene is segmented into sentences. The following datasets are distributed:
- friends.train.episode_delim.conll: the training data where each episode is considered a document.
- friends.train.scene_delim.conll: the training data where each scene is considered a document.
- friends.test.episode_delim.conll: the test data where each episode is considered a document.
- friends.test.scene_delim.conll: the test data where each scene is considered a document.
- friends.test.episode_delim.conll.nokey: same as friends.test.episode_delim.conll; the gold keys are replaced by "-1".
- friends.test.scene_delim.conll.nokey: same as friends.test.scene_delim.conll; the gold keys are replaced by "-1".
Note that the evaluation sets did not include the gold keys during the competition; we made them available after the competition. No dedicated development set was distributed for this task; feel free to create your own development set from the training data or to perform cross-validation on the training sets.
All datasets follow the CoNLL 2012 Shared Task data format. Documents are delimited by comments in the following format:
#begin document (<Document ID>)[; part ###]
...
#end document
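For instance, a distributed *.conll file can be split into documents by scanning for these delimiter comments. The following is a minimal sketch in Python; the function name read_documents is illustrative and not part of any distributed code:

```python
def read_documents(path):
    """Yield (document ID, token lines) pairs from a distributed *.conll file."""
    doc_id, lines = None, []
    with open(path, encoding='utf-8') as f:
        for line in f:
            line = line.rstrip('\n')
            if line.startswith('#begin document'):
                # e.g., "#begin document (/friends-s01e01)"
                doc_id, lines = line[line.find('(') + 1:line.rfind(')')], []
            elif line.startswith('#end document'):
                yield doc_id, lines
                doc_id = None
            elif doc_id is not None:
                lines.append(line)

# e.g., iterate over the documents of the scene-delimited training data
for doc_id, lines in read_documents('friends.train.scene_delim.conll'):
    print(doc_id, len(lines))
```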
Each sentence is delimited by an empty line ("\n"), and each column indicates the following (a parsing sketch is given after the list):
- Document ID: /<name of the show>-<season ID><episode ID> (e.g., /friends-s01e01).
- Scene ID: the ID of the scene within the episode.
- Token ID: the ID of the token within the sentence.
- Word form: the tokenized word.
- Part-of-speech tag: the part-of-speech tag of the word (auto generated).
- Constituency tag: the Penn Treebank style constituency tag (auto generated).
- Lemma: the lemma of the word (auto generated).
- Frameset ID: not provided (always "-").
- Word sense: not provided (always "-").
- Speaker: the speaker of this sentence.
- Named entity tag: the named entity tag of the word (auto generated).
- Entity ID: the entity ID of the mention, which is consistent across all documents.
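Below is a minimal sketch of how a token line can be parsed into these twelve columns. The Token class and parse_token function are illustrative names, and the example line is taken from the sample that follows:

```python
from typing import NamedTuple

class Token(NamedTuple):
    document_id: str   # e.g., /friends-s01e01
    scene_id: str      # scene ID within the episode
    token_id: str      # token ID within the sentence
    word: str          # tokenized word form
    pos: str           # part-of-speech tag
    constituency: str  # Penn Treebank style constituency tag
    lemma: str         # lemma of the word
    frameset: str      # not provided (always "-")
    word_sense: str    # not provided (always "-")
    speaker: str       # speaker of the sentence
    ner: str           # named entity tag
    entity: str        # entity ID column, e.g., "(284)" or "-"

def parse_token(line):
    """Split a whitespace-delimited token line into its twelve columns."""
    return Token(*line.split())

token = parse_token('/friends-s01e01 0 0 He PRP (TOP(S(NP*) he - - Monica_Geller * (284)')
print(token.speaker, token.word, token.entity)  # Monica_Geller He (284)
```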
Here is a sample from the training dataset:
/friends-s01e01 0 0 He PRP (TOP(S(NP*) he - - Monica_Geller * (284)
/friends-s01e01 0 1 's VBZ (VP* be - - Monica_Geller * -
/friends-s01e01 0 2 just RB (ADVP*) just - - Monica_Geller * -
/friends-s01e01 0 3 some DT (NP(NP* some - - Monica_Geller * -
/friends-s01e01 0 4 guy NN *) guy - - Monica_Geller * (284)
/friends-s01e01 0 5 I PRP (SBAR(S(NP*) I - - Monica_Geller * (248)
/friends-s01e01 0 6 work VBP (VP* work - - Monica_Geller * -
/friends-s01e01 0 7 with IN (PP*)))))) with - - Monica_Geller * -
/friends-s01e01 0 8 ! . *)) ! - - Monica_Geller * -
/friends-s01e01 0 0 C'mon VB (TOP(S(S(VP*)) c'mon - - Joey_Tribbiani * -
/friends-s01e01 0 1 , , * , - - Joey_Tribbiani * -
/friends-s01e01 0 2 you PRP (NP*) you - - Joey_Tribbiani * (248)
/friends-s01e01 0 3 're VBP (VP* be - - Joey_Tribbiani * -
/friends-s01e01 0 4 going VBG (VP* go - - Joey_Tribbiani * -
/friends-s01e01 0 5 out RP (PRT*) out - - Joey_Tribbiani * -
/friends-s01e01 0 6 with IN (PP* with - - Joey_Tribbiani * -
/friends-s01e01 0 7 the DT (NP* the - - Joey_Tribbiani * -
/friends-s01e01 0 8 guy NN *)))) guy - - Joey_Tribbiani * (284)
/friends-s01e01 0 9 ! . *)) ! - - Joey_Tribbiani * -
A mention may include more than one word:
/friends-s01e02 0 0 Ugly JJ (TOP(S(NP(ADJP* ugly - - Chandler_Bing * (380
/friends-s01e02 0 1 Naked JJ *) naked - - Chandler_Bing * -
/friends-s01e02 0 2 Guy NNP *) Guy - - Chandler_Bing * 380)
/friends-s01e02 0 3 got VBD (VP* get - - Chandler_Bing * -
/friends-s01e02 0 4 a DT (NP* a - - Chandler_Bing * -
/friends-s01e02 0 5 Thighmaster NN *)) thighmaster - - Chandler_Bing * -
/friends-s01e02 0 6 ! . *)) ! - - Chandler_Bing * -
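A minimal sketch of recovering mentions and their entity IDs from the last column: "(284)" marks a single-token mention, "(380" and "380)" open and close a multi-token mention, and "-" means the token belongs to no mention. This sketch assumes mentions do not nest or overlap, which holds for the samples above:

```python
def extract_entity_ids(entity_column):
    """Return the entity IDs of all mentions in sequential order."""
    ids = []
    for cell in entity_column:
        if cell.startswith('('):          # a mention starts at this token
            ids.append(int(cell.strip('()')))
        # a closing cell such as "380)" ends a mention already counted
    return ids

# the entity columns of the three sample sentences above, concatenated
column = ['(284)', '-', '-', '-', '(284)', '(248)', '-', '-', '-',
          '-', '-', '(248)', '-', '-', '-', '-', '-', '(284)', '-',
          '(380', '-', '380)', '-', '-', '-', '-']
print(extract_entity_ids(column))  # [284, 284, 248, 248, 284, 380]
```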
The mapping between the entity ID and the actual character can be found in friends_entity_map.txt.
Your output must consist of the entity ID of each mention, one per line, in sequential order. There are 6 mentions in the above examples, which generate the following output:
284
284
248
248
284
380
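Such a file (the sys_out argument of the evaluation command described below) can be written with one predicted entity ID per line; here the predictions list simply repeats the gold keys above for illustration:

```python
# predictions would come from your system; these are the gold keys above
predictions = [284, 284, 248, 248, 284, 380]
with open('sys_out', 'w') as f:
    for entity_id in predictions:
        f.write('%d\n' % entity_id)
```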
Given this output, the evaluation script measures the following:
- The label accuracy considering only 7 entity classes: the 6 main characters (Chandler, Joey, Monica, Phoebe, Rachel, and Ross) and all other characters grouped as one class.
- The macro average of the F1 scores of these 7 classes.
- The label accuracy considering all entities, where characters not appearing in the training data are grouped as one class, OTHERS.
- The macro average of the F1 scores of all entities.
- The individual F1 scores of the 7 classes.
- The individual F1 scores of all entities.
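To make these metrics concrete, here is a minimal sketch of label accuracy and macro-averaged F1 under the 7-class grouping. The MAIN set of entity IDs is a placeholder (the real mapping is in friends_entity_map.txt), and the official evaluate.py remains the reference implementation:

```python
MAIN = {248, 284, 292, 294, 306, 335}  # placeholder IDs, not the real mapping

def to_classes(keys, main_ids):
    """Map raw entity IDs to the 7-class scheme: 6 main characters + OTHERS."""
    return [k if k in main_ids else 'OTHERS' for k in keys]

def accuracy(gold, pred):
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

def macro_f1(gold, pred):
    """Average the per-class F1 scores over all classes that occur."""
    labels = set(gold) | set(pred)
    f1s = []
    for c in labels:
        tp = sum(g == c and p == c for g, p in zip(gold, pred))
        fp = sum(g != c and p == c for g, p in zip(gold, pred))
        fn = sum(g == c and p != c for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall) if precision + recall else 0.0)
    return sum(f1s) / len(f1s)

gold = to_classes([284, 284, 248, 248, 284, 380], MAIN)
pred = to_classes([284, 306, 248, 248, 284, 284], MAIN)
print(accuracy(gold, pred), macro_f1(gold, pred))
```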
The following shows the command to run evaluate.py:
python evaluate.py ref_out sys_out
- ref_out: the reference output including the gold keys (download ref.out).
- sys_out: the path to a file containing your system output; this should include 2,429 lines of keys, where each line indicates the entity ID of the corresponding mention.
This task was hosted on CodaLab from 08/21/2017 to 01/29/2018: https://competitions.codalab.org/competitions/17310.
This evaluation considers all characters appearing in the training, development, and evaluation sets as individual classes. Characters that appear in only one or two of these sets are grouped into one class called OTHERS.
| User ID | Label Accuracy | Average F1 |
|---|---|---|
| AMORE UPF | 74.72 | 41.05 |
| Cheoneum | 69.49 | 16.98 |
| Kampfpudding | 59.45 | 37.37 |
| Zuma | 25.81 | 14.42 |
This evaluation considers the 6 main characters as individual classes and all the other characters as one class called OTHERS.
| User ID | Label Accuracy | Average F1 |
|---|---|---|
| Cheoneum | 85.10 | 86.00 |
| AMORE UPF | 77.23 | 79.36 |
| Kampfpudding | 73.36 | 73.51 |
| Zuma AR | 46.07 | 43.15 |
The system outputs from all participants, as well as their detailed evaluation results, are available below:
| User ID | Output | Evaluation |
|---|---|---|
| AMORE UPF | AMORE_UPF.out | AMORE_UPF.eval |
| Cheoneum | Cheoneum.out | Cheoneum.eval |
| Kampfpudding | Kampfpudding.out | Kampfpudding.eval |
| Zuma AR | Zuma.out | Zuma.eval |