TEXT Open Intent Recognition (TEXTOIR)

TEXTOIR is the first high-quality Text Open Intent Recognition platform. This repo contains a convenient toolkit with extensible interfaces that integrates a series of state-of-the-art algorithms for two tasks: open intent detection and open intent discovery. We also release the pipeline framework and the visualized platform in the repo TEXTOIR-DEMO.

Introduction

TEXTOIR aims to provide a convenient toolkit for researchers to reproduce related text open classification and clustering methods. It covers two tasks: open intent detection and open intent discovery. Open intent detection aims to identify the n known intent classes and to detect a single open (unknown) intent class. Open intent discovery aims to leverage limited prior knowledge of known intents to find fine-grained clusters of both known and open intents. Related papers and code are collected in our previously released reading list.
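To make the open intent detection setting concrete, here is a minimal sketch in the spirit of the MSP baseline (thresholding the maximum softmax probability). It is not the TEXTOIR implementation: the function names, the `<open>` label, the threshold value of 0.5, and the example logits are all assumptions chosen for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_open_intent(logits, known_intents, threshold=0.5, open_label="<open>"):
    """Return a known intent, or open_label when the classifier is not confident.

    `threshold` and `open_label` are hypothetical choices for this sketch.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return open_label  # low confidence: treat as the one-class open intent
    return known_intents[best]  # otherwise one of the n known intents


known = ["book_flight", "check_balance", "play_music"]
print(predict_open_intent([4.0, 0.5, 0.2], known))  # prints book_flight
print(predict_open_intent([1.0, 0.9, 1.1], known))  # prints <open>
```

The first utterance has one dominant score and is assigned a known intent; the second has nearly uniform scores, so it falls below the threshold and is flagged as open.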


We strongly recommend using the TEXTOIR toolkit, which provides standard, unified interfaces (especially for data settings) so that results on benchmark intent datasets are fair and convincing.

Benchmark Datasets

Integrated Models

Open Intent Detection

Learning Discriminative Representations and Decision Boundaries for Open Intent Detection (DA-ADB, arXiv 2022)
Out-of-Scope Intent Detection with Self-Supervision and Discriminative Training ((K+1)-way, ACL 2021)
Deep Open Intent Classification with Adaptive Decision Boundary (ADB, AAAI 2021)
Unknown Intent Detection Using Gaussian Mixture Model with an Application to Zero-shot Intent Classification (SEG, ACL 2020)
Deep Unknown Intent Detection with Margin Loss (DeepUnk, ACL 2019)
DOC: Deep Open Classification of Text Documents (DOC, EMNLP 2017)
A Baseline For Detecting Misclassified and Out-of-distribution Examples in Neural Networks (MSP, ICLR 2017)
Towards Open Set Deep Networks (OpenMax, CVPR 2016)

Open Intent Discovery

Semi-supervised Clustering Methods
- Discovering New Intents with Deep Aligned Clustering (DeepAligned, AAAI 2021)
- Discovering New Intents via Constrained Deep Adaptive Clustering with Cluster Refinement (CDAC+, AAAI 2020)
- Learning to Discover Novel Visual Categories via Deep Transfer Clustering (DTC*, ICCV 2019)
- Multi-class Classification Without Multi-class Labels (MCL*, ICLR 2019)
- Learning to cluster in order to transfer across domains and tasks (KCL*, ICLR 2018)
Unsupervised Clustering Methods
- Towards K-means-friendly Spaces: Simultaneous Deep Learning and Clustering (DCN, ICML 2017)
- Unsupervised Deep Embedding for Clustering Analysis (DEC, ICML 2016)
- Stacked auto-encoder K-Means (SAE-KM)
- Agglomerative clustering (AG)
- K-Means (KM)

(* denotes a computer vision model whose original backbone is replaced with BERT)
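As a rough illustration of the unsupervised end of open intent discovery, the sketch below runs plain K-Means on toy 2-D vectors that stand in for sentence embeddings. Everything here is an assumption for illustration (the function names, the toy data, and the simplistic "first k points" initialization); it is not the code used by the toolkit's KM baseline.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iters=100):
    """Lloyd's algorithm with a deterministic (first-k) initialization."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iters):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing: converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Toy "embeddings": two well-separated groups of utterances.
embeddings = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(embeddings, k=2)
print(labels)  # prints [0, 0, 0, 1, 1, 1]
```

Each resulting cluster would then be interpreted as a candidate intent. The semi-supervised methods listed above additionally use the labeled known intents to guide the clustering.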

Quick Start

Use Anaconda to create a Python (version >= 3.6) environment

conda create --name textoir python=3.6
conda activate textoir

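Install PyTorch with CUDA support (stated CUDA version: 11.2; note that the command below installs cudatoolkit 11.0, so adjust the version to match your system)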

conda install pytorch torchvision torchaudio cudatoolkit=11.0 -c pytorch -c conda-forge
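After installation, a quick check such as the following can confirm whether PyTorch is importable and whether it sees a GPU. This is a small convenience sketch, not part of the repository; the function name `describe_torch_install` is hypothetical.

```python
import importlib.util


def describe_torch_install():
    """Report whether PyTorch is installed and which device it would use."""
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed in this environment"
    import torch  # imported lazily so the check also works without PyTorch

    device = "cuda" if torch.cuda.is_available() else "cpu"
    return f"PyTorch {torch.__version__} will run on: {device}"


print(describe_torch_install())
```

If the reported device is `cpu` on a machine with a GPU, the installed cudatoolkit version probably does not match the driver.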

Clone the TEXTOIR repository and choose a task (open intent detection is used as the example here).

git clone git@github.com:HanleiZhang/TEXTOIR.git
cd TEXTOIR
cd open_intent_detection

Install the required dependencies

pip install -r requirements.txt

Run an example (ADB is used here)

sh examples/run_ADB.sh

Extensibility

This toolkit is extensible: new methods, datasets, configurations, backbones, dataloaders, and losses can be added conveniently. More details are available in the open_intent_detection and open_intent_discovery directories, respectively.

Citations

If this work is helpful, or if you use the code or results from this repo, please cite the following papers:

@inproceedings{zhang-etal-2021-textoir,
    title = "{TEXTOIR}: An Integrated and Visualized Platform for Text Open Intent Recognition",
    author = "Zhang, Hanlei  and Li, Xiaoteng  and Xu, Hua  and Zhang, Panpan and Zhao, Kang  and Gao, Kai",
    booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations",
    pages = "167--174",
    year = "2021",
    url = "https://aclanthology.org/2021.acl-demo.20",
    doi = "10.18653/v1/2021.acl-demo.20",
}

@article{zhang2022towards,
  title={Learning Discriminative Representations and Decision Boundaries for Open Intent Detection},
  author={Zhang, Hanlei and Xu, Hua and Zhao, Shaojie and Zhou, Qianrui},
  journal={arXiv preprint arXiv:2203.05823},
  year={2022}
}

Contributors

Hanlei Zhang, Shaojie Zhao, Xin Wang, Ting-En Lin, Qianrui Zhou, Huisheng Mao.

Bugs or questions?

If you have any questions, please open an issue and describe your problem in as much detail as possible. If you want to integrate your method into our repo, feel free to open a pull request!
