Esports-Data-to-Text

Scripts and related materials for Commentary Generation from Data Records of Multiplayer Strategy Esports Game (Zihan Wang, Naoki Yoshinaga; NAACL SRW 2024).

TL;DR

In this work, we set up a task of generating esports commentaries from structured data records of multiplayer strategy games; we built large-scale datasets, designed a set of evaluation criteria, and evaluated strong LLM-based baselines to reveal remaining challenges.

Datasets

The first contribution of this work is a set of large-scale data-to-text datasets for League of Legends (LoL), one of the most popular esports games.

The statistics of our esports data-to-text datasets are as follows. The table also lists common datasets for similar tasks, showing that our datasets are comparable in size.

	LoL19 (core)	LoL19-21 (extended)	LoL-V2T Video2Text	RotoWire Basketball	GameKnot Chess
# games (matches)	220	650	157	4,853	11,578
# examples	3,490	10,590	9,723	4,853	298,008
Avg. length of input	540.47	541.10	N/A (video)	628.00	25.73
Avg. length of output	374.68	373.89	15.4	337.10	20.55

To protect data copyright and privacy, we provide only the scripts used for collecting and processing the data, not the processed data itself. See this example.

Example Input (Original)

… , {"type": "WARD_PLACED", "timestamp": 905433, "wardType": "YELLOWTRINKET", "creatorId": 6}, {"type": "WARD_KILL", "timestamp": 908742, "wardType": "YELLOWTRINKET", "killerId": 1}, {"type": "WARD_PLACED", "timestamp": 908775, "wardType": "CONTROLWARD", "creatorId": 5}, …

Example Input (Linearized)

… WARDPLACED|type 905433|timestamp YELLOWTRINKET|wardType 6|creatorId WARDKILL|type 908742|timestamp YELLOWTRINKET|wardType 1|killerId WARDPLACED|type 908775|timestamp CONTROLWARD|wardType 5|creatorId …
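The mapping from the original records to the linearized form can be sketched as follows. This is a minimal illustration inferred only from the two examples above, not the repository's actual processing script: the function names `linearize_event` and `linearize` are hypothetical, and the rule that underscores are dropped from values (WARD_PLACED becomes WARDPLACED) is an assumption read off the example.

```python
import json


def linearize_event(event):
    """Turn one event record into space-separated `value|key` tokens."""
    tokens = []
    for key, value in event.items():
        # Assumption: underscores are removed from values, as in the
        # example above (WARD_PLACED -> WARDPLACED).
        tokens.append(f"{str(value).replace('_', '')}|{key}")
    return " ".join(tokens)


def linearize(events):
    """Concatenate the linearized events in their original order."""
    return " ".join(linearize_event(event) for event in events)


# The three records shown in "Example Input (Original)".
raw = """[
  {"type": "WARD_PLACED", "timestamp": 905433, "wardType": "YELLOWTRINKET", "creatorId": 6},
  {"type": "WARD_KILL", "timestamp": 908742, "wardType": "YELLOWTRINKET", "killerId": 1},
  {"type": "WARD_PLACED", "timestamp": 908775, "wardType": "CONTROLWARD", "creatorId": 5}
]"""

events = json.loads(raw)
linearized = linearize(events)
print(linearized)
```

Each field keeps its key as a suffix, so the flat sequence still records which value belongs to which attribute.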

Example Output

(linebreaks omitted)

… just to stay even in a map state g2 can get exclusive vision on an area then suddenly the Nautilus veigar will have a lot of zone control but so behind in map control it's more about quick wards …

Example Output (ChatGPT Refinement)

(work in progress, for experimental use only, not included in this work)

… just to stay even in a map state. G2 can get exclusive vision on an area, then suddenly the Nautilus and Veigar will have a lot of zone control, but so behind in map control, it's more about quick wards …

Related Resources

Further information can be found at Riot Developer and LoL Fandom Wiki (World Championship details and links to contest videos). We also recommend joining the Discord server "Riot Games Third Party Developer Community", where experienced developers work with data from Riot titles (LoL, TFT, LoR, Val, etc.).

Evaluation Metrics

In this work, we use reference-based metrics common to other data-to-text tasks and perform human evaluation based on the characteristics of esports.

Reference-based Metrics (Automatic)

Following existing data-to-text tasks, we adopt these metrics for automatic evaluation.

sacreBLEU
normalized Damerau-Levenshtein distance (text distance)
ROUGE-L
BERTScore
BARTScore (example of usage)

There are various implementations of these metrics; the links provided above are merely suggestions.
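As one illustration of the text-distance metric listed above, the sketch below computes a normalized edit distance with adjacent transpositions in plain Python. It is not the implementation used in the paper. It uses the optimal string alignment variant (often called restricted Damerau-Levenshtein), and two choices are assumptions made for this example: texts are compared as whitespace-separated tokens, and the distance is normalized by the length of the longer sequence.

```python
def osa_distance(a, b):
    """Optimal string alignment distance between two sequences.

    Counts insertions, deletions, substitutions, and transpositions
    of two adjacent items, each at cost 1.
    """
    n, m = len(a), len(b)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Transposition of two adjacent items.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[n][m]


def normalized_distance(hypothesis, reference):
    """Token-level distance scaled to [0, 1]; 0 means identical."""
    # Assumption: whitespace tokenization and max-length normalization.
    hyp, ref = hypothesis.split(), reference.split()
    longest = max(len(hyp), len(ref))
    if longest == 0:
        return 0.0
    return osa_distance(hyp, ref) / longest


score = normalized_distance(
    "g2 can get exclusive vision",
    "g2 can exclusive get vision",
)
print(score)
```

Lower values mean the generated commentary is closer to the reference; a library implementation may tokenize or normalize differently, so scores are only comparable within one setup.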

Task-specific Metrics (Human Scoring)

Because strategic depth is difficult to estimate with automatic metrics, we gather human scores using criteria tailored for esports commentaries. Human scoring aims to judge whether the output commentary contains strategically relevant content such as players' intentions and teams' arrangements.

The criteria for human scoring are detailed in the following table.

Strategic depth	score
Based on the criteria for obtaining a score of 4, the strategic considerations are inspiring, providing insights to help learn from the skillful players and teams	5
Based on the criteria for obtaining a score of 3, the strategic considerations are sufficient and closely related to the game moment described by the structured data	4
Based on explaining the facts, the commentary also reflects several strategic considerations, such as the player’s intention and the team’s arrangement	3
The commentary only reflects the core event of the game moment described by the structured data, without providing any strategic consideration	2
The commentary reflects no facts or only a few facts of the game moment described by the structured data	1

Citation

You are welcome to star this repo and cite our work using the following BibTeX:

@inproceedings{wang2024commentary,
  title={Commentary Generation from Data Records of Multiplayer Strategy Esports Game},
  author={Wang, Zihan and Yoshinaga, Naoki},
  booktitle={Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 4: Student Research Workshop)},
  pages={263--271},
  year={2024}
}

ArnoZWang/esports-data-to-text