Shared Task in NLPCC 2022: Multimodal Product Summarization

The multimodal product summarization task aims at generating a condensed textual summary for a given product. The input contains a detailed product description, a product knowledge base, and a product image.

Dataset
- Dataset statistics
- An example in the dataset
Baselines
- Text-only version
- Multimodal version
Evaluation
- Human Evaluation
Participation
Award
Important Dates
Reference

Dataset

The dataset for this task is collected from JD.COM, a mainstream Chinese e-commerce platform. Each sample is a (product textual description, product knowledge base, product image, product summary) tuple. The dataset covers three product categories: Cases&Bags, Home Appliances, and Clothing.
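To make the tuple structure concrete, here is a minimal sketch of how one sample could be represented in Python. The class name, field names, and image path are hypothetical and chosen only for illustration. The actual file format of the released data is not described here, and the category assigned to the example is an assumption.

```python
from dataclasses import dataclass
from typing import Dict

# The three product categories named in the task description.
CATEGORIES = ("Cases&Bags", "Home Appliances", "Clothing")


@dataclass
class ProductSample:
    """One dataset sample; all field names are hypothetical."""
    description: str                # detailed product textual description
    knowledge_base: Dict[str, str]  # product attribute name -> value
    image_path: str                 # location of the single product image
    summary: str                    # reference product summary (target)
    category: str                   # expected to be one of CATEGORIES


def validate(sample: ProductSample) -> None:
    """Raise ValueError if the sample is obviously malformed."""
    if sample.category not in CATEGORIES:
        raise ValueError(f"unknown category: {sample.category!r}")
    if not sample.description or not sample.summary:
        raise ValueError("description and summary must be non-empty")


# Shortened values taken from the example in this document.
sample = ProductSample(
    description="TCL D49A630U 49英寸超薄金属机身30核HDR 4K超清智能电视机（黑色）",
    knowledge_base={"屏幕尺寸": "49英寸", "HDR显示": "支持HDR"},
    image_path="images/example.jpg",  # hypothetical path
    summary="这款TCL智能液晶电视，海量片库，任你观看。",
    category="Home Appliances",       # assumed category for a television
)
validate(sample)
```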

Dataset statistics

	Cases&Bags	Home Appliances	Clothing
Train	50,000	100,000	200,000
Valid	10,000	10,000	10,000
Test	10,000	10,000	10,000
Input tokens/sample	319.0	336.6	294.8
Product attributes/sample	14.8	7.8	7.3
Product images/sample	1	1	1

An example in the dataset

Input text	TCL D49A630U 49英寸超薄金属机身30核HDR 4K超清智能电视机（黑色）一个化繁为简的系统，暗的更暗，亮的更亮，您再也不会埋怨影片不够看了，4K超高清+全生态HDR，金属压铸支架，浑然天成的品质感，4K超高清屏幕，这是89万像素RGB真4K，30核心，性能再度升级，4K大屏新旗舰电视，让场景接近您感受的真实度，64位电视处理器，20000转/分高速打磨，硬解码，腾讯视频，全新升级，电视不止更薄，显示还更出色，金属纤薄机身，丰富接口，45°纹钻切工艺，4K超高清内容，一体成型LG0，强大的扩展能力，HDr实时转化，全生态HDR，24小时更新不断，金属外观设计，精美的UI设计，海量影视资源，集成了主页、影视、生活等
Product image
Product attribute	能效等级：3级；电视类型：4K超清；屏幕尺寸：49英寸；HDR显示：支持HDR
Output summary	这款TCL智能液晶电视，海量片库，任你观看。4K高清画质，画面更加清晰，给你身临其境之感。搭载画质新宠HDR技术，还原真实自然的画面。

Baselines

The baselines are based on K-PLUG: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in E-Commerce (Findings of EMNLP 2021) [4].

Text-only version

code

Multimodal version

code

Note: submissions of the unmodified baselines are NOT acceptable.

Evaluation

The results need to be submitted to the leaderboard of the challenge. The evaluation consists of two stages: an automatic evaluation and a human evaluation. For the automatic evaluation, we adopt the ROUGE metric. The top 5 teams, ranked by the average of their ROUGE-1, ROUGE-2, and ROUGE-L scores, advance to the second stage, the human evaluation. For the human evaluation, we assess faithfulness, readability, non-redundancy, and importance on 100 randomly sampled summaries per category. The final ranking is determined by the average score of the human evaluation.
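The automatic stage can be illustrated with a small sketch. This is not the official scorer: it assumes character-level tokens (a common choice for Chinese text), uses the F1 variant of each ROUGE score, and all function names are hypothetical. The leaderboard may use a different tokenization or ROUGE implementation.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """F1 from an overlap count and the candidate/reference totals."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(cand, ref, n):
    """ROUGE-N F1 based on clipped n-gram overlap."""
    c, r = ngrams(cand, n), ngrams(ref, n)
    overlap = sum((c & r).values())  # Counter intersection clips counts
    return f1(overlap, sum(c.values()), sum(r.values()))


def lcs_len(a, b):
    """Length of the longest common subsequence (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def rouge_l(cand, ref):
    """ROUGE-L F1 based on the longest common subsequence."""
    return f1(lcs_len(cand, ref), len(cand), len(ref))


def average_rouge(cand, ref):
    """Average of ROUGE-1, ROUGE-2, and ROUGE-L, as used for the ranking."""
    return (rouge_n(cand, ref, 1) + rouge_n(cand, ref, 2) + rouge_l(cand, ref)) / 3


def top_k_teams(team_scores, k=5):
    """Names of the k teams with the highest average ROUGE score."""
    return sorted(team_scores, key=team_scores.get, reverse=True)[:k]


# Character-level tokens for a generated and a reference summary.
candidate = list("4K高清画质，画面清晰")
reference = list("4K高清画质，画面更加清晰")
score = average_rouge(candidate, reference)
```

In practice a team's score would be the mean of `average_rouge` over all test samples, and `top_k_teams` would be applied to those per-team means.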

Human Evaluation

Metrics	Scores
Faithfulness	0=unfaithful to the input 2=faithful to the input
Readability	0=hard to understand 1=partially hard to understand 2=easy to understand
Non-redundancy	0=full of redundant information 1=partially redundant information 2=no redundant information
Importance	0=no useful information 1=partially useful information 2=totally useful information
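The rubric above can be aggregated as in the following sketch. It assumes that the four metrics are weighted equally and that a team's score is the plain mean over its rated summaries; the exact averaging procedure is not specified here, and the names `RUBRIC`, `summary_score`, and `team_score` are hypothetical. Faithfulness accepts only 0 and 2 because the table lists only those two values.

```python
from statistics import mean

# Allowed scores per metric, copied from the rubric table.
RUBRIC = {
    "faithfulness": {0, 2},
    "readability": {0, 1, 2},
    "non_redundancy": {0, 1, 2},
    "importance": {0, 1, 2},
}


def summary_score(ratings):
    """Average the four metric scores of one rated summary."""
    for metric, allowed in RUBRIC.items():
        if ratings.get(metric) not in allowed:
            raise ValueError(f"invalid or missing score for {metric!r}")
    return mean(ratings[metric] for metric in RUBRIC)


def team_score(all_ratings):
    """Average the per-summary scores over all rated summaries of a team."""
    return mean(summary_score(r) for r in all_ratings)


ratings = [
    {"faithfulness": 2, "readability": 2, "non_redundancy": 1, "importance": 1},
    {"faithfulness": 2, "readability": 2, "non_redundancy": 2, "importance": 2},
]
final = team_score(ratings)
```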

Participation

If you are interested in the challenge, please fill out the application form and email lihaoran24 at jd.com. Please write from your organization's email address and state that you are participating in the challenge. The dataset will then be sent to you.

Award

The top 3 participating teams will receive certificates from NLPCC and CCF-NLP, as well as cash prizes.

The first prize (*1): ￥3000

The second prize (*1): ￥2000

The third prize (*1): ￥1000

Important Dates

Announcement of shared tasks and call for participation: 2022/3/15

Registration open: 2022/3/15

Release of detailed task guidelines & training data: 2022/3/15

First submission of results on the blind test data: 2022/5/1

Registration deadline: 2022/5/5

Participants’ results submission deadline: 2022/6/3

Evaluation results release and call for system reports and conference paper: 2022/6/10

Conference paper submission deadline (only for shared tasks): 2022/6/20

Conference paper accept/reject notification: 2022/7/4

Camera-ready paper submission deadline: 2022/7/18

Reference

[1] Li et al., Aspect-Aware Multimodal Summarization for Chinese E-Commerce Products. AAAI 2020.

[2] Yuan et al., On the Faithfulness for E-commerce Product Summarization. COLING 2020.

[3] Xu et al., Self-Attention Guided Copy Mechanism for Abstractive Summarization. ACL 2020.

[4] Xu et al., K-PLUG: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in E-Commerce. Findings of EMNLP 2021.
