/multimodal_dialog_summary

A summary of papers and projects on visual dialog, video dialog, and multimodal dialog (continuously updated)

Visual Dialog

Year 2023

  • TCSVT 2023 Heterogeneous Knowledge Network for Visual Dialog link

Year 2022

  • CVPR 2022 UTC: A Unified Transformer With Inter-Task Contrastive Learning for Visual Dialog link
  • ArXiv 2022 Modeling Coreference Relations in Visual Dialog link
  • ICASSP 2022 Improving Cross-Modal Understanding in Visual Dialog Via Contrastive Learning link
  • Information Processing & Management 2022 HVLM: Exploring Human-Like Visual Cognition and Language-Memory Network for Visual Dialog link
  • Pattern Recognition 2022 VD-PCR: Improving visual dialog with pronoun coreference resolution link

Year 2019

  • CVPR 2019 Recursive Visual Attention in Visual Dialog link

Year 2017

  • CVPR 2017 Visual Dialog link

Video Dialog / AVSD / Video-grounded Dialog Generation

Year 2022

  • EMNLP 2022 Findings Collaborative Reasoning on Multi-Modal Semantic Graphs for Video-Grounded Dialogue Generation link
  • NAACL 2022 VGNMN: Video-grounded Neural Module Networks for Video-Grounded Dialogue Systems link
  • EMNLP 2022 Information-Theoretic Text Hallucination Reduction for Video-grounded Dialogue link
  • AAAI 2022 Workshop Audio Visual Scene-Aware Dialog Generation with Transformer-based Video Representations link
  • ICIP 2022 Video-Grounded Dialogues with Joint Video and Image Training link
  • ECCV 2022 Video Dialog as Conversation about Objects Living in Space-Time link code

Year 2021

  • TASLP 2021 End-to-End Recurrent Cross-Modality Attention for Video Dialogue link
  • TASLP 2021 Bridging Text and Video: A Universal Multimodal Transformer for Audio-Visual Scene-Aware Dialog link
  • AAAI 2021 Dynamic Graph Representation Learning for Video Dialog via Multi-Modal Shuffled Transformers link
  • AAAI 2021 Structured Co-reference Graph Attention for Video-grounded Dialogue link
  • ICLR 2021 Learning Reasoning Paths over Semantic Graphs for Video-grounded Dialogues link

Year 2020

  • TCSVT 2020 Video Dialog via Multi-Grained Convolutional Self-Attention Context Multi-Modal Networks link
  • NAACL 2020 Video-Grounded Dialogues with Pretrained Generation Language Models link
  • EMNLP 2020 BiST: Bi-directional Spatio-Temporal Reasoning for Video-Grounded Dialogues link code

Year 2019

  • ACL 2019 Multimodal Transformer Networks for End-to-End Video-Grounded Dialogue Systems link code
  • ICASSP 2019 End-to-end Audio Visual Scene-aware Dialog Using Multimodal Attention-based Video Features link

Multimodal Dialog

Year 2022

  • ICASSP 2022 A Non-Hierarchical Attention Network with Modality Dropout for Textual Response Generation in Multimodal Dialogue Systems link
  • ArXiv 2022 Multimodal Dialog Systems with Dual Knowledge-enhanced Generative Pretrained Language Model link

Year 2021

  • SIGIR 2021 MMConv: An Environment for Multimodal Conversational Search across Multiple Domains link
  • ACMMM 2021 Multimodal Dialog System: Relational Graph-based Context-aware Question Understanding link

Year 2020

  • ACMMM 2020 Multimodal Dialogue Systems via Capturing Context-aware Dependencies of Semantic Elements link

Year 2019

  • ACMMM 2019 Multimodal Dialog System: Generating Responses via Adaptive Decoders link
  • SIGIR 2019 User Attention-guided Multimodal Dialog Systems link
  • ACL 2019 Ordinal and Attribute Aware Response Generation in a Multimodal Dialogue System link

Year 2018

  • ACMMM 2018 Knowledge-aware Multimodal Dialogue Systems link
  • AAAI 2018 Towards Building Large Scale Multimodal Domain-Aware Conversation Systems link