Multimodal Recommendation Dialog with SUbjective PREference (SURE)

We introduces a new dataset SURE (Multimodal Recommendation Dialog with SUbjective PREference), which contains 12K shopping dialogs in complex store scenes. The data is built in two phases with human annotations to ensure quality and diversity. SURE is well-annotated with subjective preferences and recommendation acts proposed by sales experts. A comprehensive analysis is given to reveal the distinguishing features of SURE. Three benchmark tasks are then proposed on the data to evaluate the capability of multimodal recommendation agents. Based on the SURE, we propose a baseline model, powered by a state-of-the-art multimodal model, for these tasks.

Illustration of the SURE Dataset

Track Description

Tasks and Metrics

Sub-Task #1	Subjective Preference Disambiguation
Goal	To determine candidate attribute values according subjective preference.
Input	Dialog history with subjective preference in the latest round, Multimodal context
Output	Candidate attribute values
Example	Customer: "I prefer color of happiness." → yellow, brown, red
Metrics	Disam F1 / Precision / Recall

Sub-Task #2	Referred Region Understanding
Goal	To determine candidate object IDs according to regional reference.
Input	Dialog history with regional reference in the latest round, Multimodal context
Output	Candidate object IDs
Example	Salesperson: "Look at the shelf on the right. Are there any clothes that you like?" Customer: "Sorry, there is no garment that I am looking for in this region." → 12, 13, 16, 22, 31
Metrics	Refer F1 / Precision / Recall

Sub-Task #3-1	Multimodal Recommendation - Act Prediction
Goal	To predict the salesperson's act in the next dialog round
Input	Dialog history, Multimodal context
Output	Act of salesperson in the next dialog round
Example	Salesperson: "The price $299 is too expensive for me to afford" → Revise Attribute
Metrics	Act F1
Sub-Task #3-2	Multimodal Recommendation - Response Generation
Goal	To generate the salesperson's utterance in the next dialog round
Input	Dialog history, Multimodal context
Output	Salesperson's utterance in the next dialog round
Example	Customer: "I’d like to buy a sofa made by materials obtained from nature" → Salesperson: "You can consider the leather material, which is natural and smooth."
Metrics	BLEU-4 / ROUGH-L / METEOR
Sub-Task #3-3	Multimodal Recommendation - Item Recommendation
Goal	To predict the target object in the last round of dialog
Input	Dialog history without last round, Multimodal context
Output	Target object ID
Example	(Dialog history) → Salesperson: "How about the black, unadorned overcoat? <@1032>"
Metrics	Recom F1

Citation

If you want to publish experimental results with our datasets or use the baseline models, please cite the following articles:

Lisence

SURE is released under CC-BY-NC-SA-4.0, see LICENSE for details.

LYX0501/SURE

Multimodal Recommendation Dialog with SUbjective PREference (SURE)

Track Description

Tasks and Metrics

Citation

Lisence