| Venue | Direct Submission | ARR Commit | Author Response | Notification | Conference | Location / Notice |
|---|---|---|---|---|---|---|
| SIGDIAL | 05/11 | 06/18 | - | 07/02 | 09/07 - 09/09 | Edinburgh |
| COLING | 05/17 | - | - | 08/15 | 10/12 - 10/15 | Gyeongju, Korea |
| EMNLP | 06/24 | 07/24 | 08/23 - 08/29 | 10/06 | 12/07 - 12/11 | Abu Dhabi (ARR withdraw: 05/24) |
| AACL | 07/15 | 08/21 | 08/15 - 08/21 | 09/20 | 11/21 - 11/24 | Taiwan |

ACL Rolling Review deadlines: 06/01, 07/15, 09/01, 10/15, 12/01, 01/15/2023
(Conference deadlines: https://aideadlin.es/?sub=ML,CV,NLP,RO,SP or https://ccfddl.github.io/)
⭐ Goals:
- Primarily for sharing knowledge across different domains and catching up on recent updates.
- Contents:
- Collect only interesting papers.
- Summarize the approaches and frameworks.
- Write strengths and weaknesses, and share potential applications to other domains.
- Highlight some exciting papers. Template:
- Title: the paper title
- Summary: strengths and weaknesses
- Worth noting: specific paragraphs or designs that deserve attention or further reading
🤖 Schedule:
- This document is updated continuously and released (bi-)weekly on Fridays.
❤️ Welcome:
- You are more than welcome to invite anyone and to edit any part of the document, including but not limited to deleting, adding, and modifying content.
Deepmind
- (Important paper) Flamingo: a Visual Language Model for Few-Shot Learning
Microsoft
- Vision-Language-Audio: i-Code: An Integrative and Composable Multimodal Learning Framework
OpenAI
- Dalle-2
- A paper from Google X: Translation between Molecules and Natural Language
- RETRO (Deepmind): Borgeaud, Sebastian, Arthur Mensch, Jordan Hoffmann, Trevor Cai, Eliza Rutherford, Katie Millican, George van den Driessche et al. "Improving language models by retrieving from trillions of tokens." arXiv preprint arXiv:2112.04426 (2021). [pdf]
RL for Dialog (NAACL 2022) - by Sergey Levine
- Context-Aware Language Modeling for Goal-Oriented Dialogue Systems
- CHAI: A CHatbot AI for Task-Oriented Dialogue with Offline Reinforcement Learning
SeeKeR and its related papers
- It outperforms GPT-3 in terms of hallucination, and it is better than BlenderBot 2.0.
- The paper is partially motivated by "Reason first, then respond: Modular Generation for Knowledge-infused Dialogue" and BlenderBot 2.0. Many people may already know that Meta treats the task-oriented dialogue (TOD) system Cairaoke as one essential component of the Metaverse, and Cairaoke integrates BlenderBot 2.0 to exhibit empathetic language and personality. In addition, "Internet-Augmented Dialogue Generation" is the core paper for BlenderBot 2.0, and I personally consider it one of the excellent papers of that year.
- Worth noting:
- It integrates a search engine into open-domain dialogue generation. The search engine first searches the Internet and keeps the top-5 retrieved documents. Then a knowledge module selects the most relevant knowledge from the retrieved documents. Finally, a response module generates a response conditioned on the dialogue context and the selected knowledge.
- The knowledge-selection module utilizes the Fusion-in-Decoder (FiD) model, initially designed for open-domain question answering. Kurt's EMNLP 2020 paper further applies the FiD model and RAG to open-domain chit-chat systems and shows impressive improvements. Note that FiD freezes the retrieval model during training, while RAG trains the generator and the retrieval model jointly.
- Open-domain question answering models have shown great potential in open-domain chit-chat systems, and retrieval-augmented models further improve their performance. Could we add or design retrieval-augmented models for task-oriented dialogue systems, so that the system stays up to date by searching the Internet?
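The three-stage pipeline described above (search, knowledge selection, response generation) can be sketched as follows. This is a toy illustration, not SeeKeR's actual API: the function names, the stand-in retriever, and the keyword-overlap selector are all hypothetical; in the paper each stage is a language-model module (the selector is FiD-style).

```python
# Minimal sketch of a SeeKeR-style modular pipeline (hypothetical names,
# not the actual SeeKeR implementation).

def search_internet(query, top_k=5):
    # Stage 1: issue a search query and keep the top-k documents.
    # Stand-in for a real search-engine call.
    return [f"document {i} about {query}" for i in range(top_k)]

def select_knowledge(context, documents):
    # Stage 2: knowledge module picks the most relevant document.
    # The paper uses an FiD-style model; here, a toy keyword-overlap heuristic.
    def overlap(doc):
        return len(set(context.lower().split()) & set(doc.lower().split()))
    return max(documents, key=overlap)

def generate_response(context, knowledge):
    # Stage 3: response module conditions on dialogue context + knowledge.
    return f"Based on '{knowledge}', here is an answer to: {context}"

def seeker_turn(dialogue_context):
    docs = search_internet(dialogue_context)
    knowledge = select_knowledge(dialogue_context, docs)
    return generate_response(dialogue_context, knowledge)

print(seeker_turn("who won the 2022 world cup"))
```

The point of the modular design is that each stage can be inspected or swapped independently, e.g. replacing the search backend without retraining the response module.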
DAIR: Data Augmented Invariant Regularization
- Data augmentation techniques on the MultiWOZ and SGD datasets; the techniques are also successfully used in the Cairaoke project.
UniGDD: A Unified Generative Framework for Goal-Oriented Document-Grounded Dialogue
Commonsense Reasoning for Conversational AI: A Survey of Recent Datasets and Benchmarks
ARR April 2022
WSDM 2022
CRS Lab
A Memory Efficient Baseline for Open Domain Question Answering
- ACL 2022
- Dialog state tracking:
- Beyond the Granularity: Multi-Perspective Dialogue Collaborative Selection for Dialogue State Tracking
- Continual Prompt Tuning for Dialog State Tracking
- Towards Fair Evaluation of Dialogue State Tracking by Flexible Incorporation of Turn-level Performances
- ASSIST: Towards Label Noise-Robust Dialogue State Tracking - (Findings of ACL) Shelby Heinecke
- Dialogue Summaries as Dialogue States (DS2), Template-Guided Summarization for Few-shot Dialogue State Tracking - (Findings of ACL)
- N-Shot Learning for Augmenting Task-Oriented Dialogue State Tracking - (Findings of ACL)
- DialogVED: A Pre-trained Latent Variable Encoder-Decoder Model for Dialog Response Generation
- Internet-Augmented Dialogue Generation - Kurt
- Multimodal Dialogue Response Generation
- ProphetChat: Enhancing Dialogue Generation with Simulation of Future Conversation
- SalesBot: Transitioning from Chit-Chat to Task-Oriented Dialogues
- UniTranSeR: A Unified Transformer Semantic Representation Framework for Multimodal Task-Oriented Dialog System
- UniDU: Towards A Unified Generative Dialogue Understanding Framework
- It designs a unified generative framework for dialogue understanding tasks, including dialogue summarization (DS), dialogue completion (DC), slot filling (SF), intent detection (ID), and dialogue state tracking (DST).
- The task query can be regarded as the task-specific prompt, which includes the task definition and domain-related information.
- It shows good few-shot and zero-shot performance.
- Worth noting:
- In general, this paper uses an architecture similar to T0, UnifiedSKG, and PPTOD. All of them follow a text-to-text pattern and use multi-task learning, and they have shown impressive few-shot and zero-shot performance.
- The intent name of a negative sample is "not defined", where the input utterances U_n are sampled from out-of-domain dialogues. The ratio of negative to positive samples for both DST and ID is set to 2:1.
- It would be interesting to see whether the model takes a long time to train and whether it generates only pre-defined classes rather than arbitrary tokens.
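The negative-sampling scheme above can be sketched as text-to-text training examples. The prompt wording and helper names below are illustrative assumptions, not UniDU's exact templates; only the "not defined" label and the 2:1 negative-to-positive ratio come from the paper.

```python
import random

# Sketch of UniDU-style text-to-text examples for intent detection with
# out-of-domain negatives (prompt wording is hypothetical, not the paper's).

def make_id_example(utterance, intent):
    # The task query acts as a task-specific prompt carrying the task
    # definition; the model generates the intent name as plain text.
    prompt = f"intent detection: what is the intent of: {utterance}"
    return {"input": prompt, "target": intent}

def build_training_set(in_domain, out_of_domain, neg_ratio=2):
    examples = [make_id_example(u, i) for u, i in in_domain]
    # Negative samples: out-of-domain utterances labeled "not defined",
    # at a negative:positive ratio of 2:1 as in the paper.
    n_neg = neg_ratio * len(in_domain)
    for u in random.sample(out_of_domain, min(n_neg, len(out_of_domain))):
        examples.append(make_id_example(u, "not defined"))
    return examples

data = build_training_set(
    in_domain=[("book a table for two", "restaurant_booking")],
    out_of_domain=["what's the weather", "play some jazz", "set an alarm"],
)
```

Generating "not defined" as an ordinary target string is what lets a single generative model reject out-of-domain inputs without a separate classifier head.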
- Other papers (low priority):
- An Interpretable Neuro-Symbolic Reasoning Framework for Task-Oriented Dialogue Generation
- Knowledge Enhanced Reflection Generation for Counseling Dialogues
- CICERO: A Dataset for Contextualized Commonsense Inference in Dialogues
- Dialog state tracking:
- ACL 2022 - Findings
- Towards Large-Scale Interpretable Knowledge Graph Reasoning for Dialogue Systems
- Data Augmentation and Learned Layer Aggregation for Improved Multilingual Language Understanding in Dialogue
- Multi-Stage Prompting for Knowledgeable Dialogue Generation
- ARR open-review April
- ARR open-review March
- ARR open-review Feb
- ARR open-review Jan
- Unsupervised Slot Schema Induction for Task-oriented Dialog → MultiWOZ and SGD
- Schema Encoding for Transferable Dialogue State Tracking
- XQA-DST: Multi-Domain and Multi-Lingual Dialogue State Tracking
- Learn to Discover Dialog Intents via Self-supervised Context Pretraining
- EVI: Multilingual Spoken Dialogue Tasks and Dataset for Knowledge-Based Enrolment, Verification, and Identification
- Meta AI (Internship Friends)
- Multi2WOZ: A Robust Multilingual Dataset and Conversational Pretraining for Task-Oriented Dialog
- Towards Policy-Guided Conversational Recommendation with Dialogue Acts
- Small Changes Make Big Differences: Improving Multi-turn Response Selection in Dialogue Systems via Fine-Grained Contrastive Learning → Ubuntu Dialog and Douban corpus
- Target-Guided Dialogue Response Generation Using Commonsense and Data Augmentation
Towards Unsupervised Dense Information Retrieval with Contrastive Learning
- It evaluates the models on the BEIR benchmark, which contains 18 retrieval datasets with a focus on diversity. Most of the datasets do not contain a training set, and the benchmark focuses on zero-shot retrieval.
- It shows SOTA performance in unsupervised and few-shot learning. The unsupervised pre-training alone outperforms BERT with intermediate MS-MARCO fine-tuning.
- Worth noting:
- It explores the limits of contrastive learning as a way to train unsupervised dense retrievers and shows that it leads to strong retrieval performance.
- The ways of building positive and negative pairs are interesting.
- Building positive pairs from a single document: (1) Inverse Cloze Task (ICT): the tokens of a sampled span serve as the query and the rest of the tokens as the document (or key); (2) Independent cropping: two spans are sampled independently from a document to form a positive pair.
- Building a large set of negative pairs: (1) negative pairs within a batch, as in SimCLR; (2) negative pairs across batches, where queries are generated from elements of the current batch and keys are elements stored in a queue, a technique proposed by MoCo.
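The two positive-pair constructions above can be sketched at the token level. This is a toy word-level illustration (the paper operates on subword tokens within a training pipeline); the function names are mine, not the paper's.

```python
import random

# Toy sketches of the two ways to build positive pairs from one document.

def inverse_cloze_task(tokens, span_len):
    # ICT: a sampled span becomes the query; the remaining tokens
    # (the document with the span removed) become the key.
    start = random.randrange(len(tokens) - span_len + 1)
    query = tokens[start:start + span_len]
    key = tokens[:start] + tokens[start + span_len:]
    return query, key

def independent_cropping(tokens, span_len):
    # Two spans sampled independently from the same document form the pair;
    # unlike ICT, the two crops may overlap.
    def crop():
        start = random.randrange(len(tokens) - span_len + 1)
        return tokens[start:start + span_len]
    return crop(), crop()

doc = "retrieval with contrastive learning works surprisingly well".split()
q, k = inverse_cloze_task(doc, span_len=3)
a, b = independent_cropping(doc, span_len=4)
```

With SimCLR-style in-batch negatives, every other pair in the batch serves as a negative; the MoCo-style queue additionally reuses keys from previous batches, giving many negatives without a huge batch size.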
- Improving Passage Retrieval with Zero-Shot Question Generation
- LOOPITR: Combining Dual and Cross Encoder Architectures for Image-Text Retrieval
- RocketQA: An Optimized Training Approach to Dense Passage Retrieval for Open-Domain Question Answering
- Augmented SBERT: Data Augmentation Method for Improving Bi-Encoders for Pairwise Sentence Scoring Tasks
- Improving Efficient Neural Ranking Models with Cross-Architecture Knowledge Distillation
- In-Batch Negatives for Knowledge Distillation with Tightly-Coupled Teachers for Dense Retrieval
- Improving Bi-encoder Document Ranking Models with Two Rankers and Multi-teacher Distillation
Two tutorials:
- Tutorials on Conversational Recommendation Systems
- Conversational Recommendation: Formulation, Methods, and Evaluation
WSDM 2022
ACL ARR April