This code is based on DeepAL [1].
In EASE, we implement Active Learning with Data Augmentation and Sample Diversity for textual classfication.
It includes Python implementations of the following active learning algorithms:
- Least Confidence [2]
- Margin Sampling [3]
We use the pre-trained BERT [4] as basic classifier and fine-tune it. One can replace it to other basic classifiers and add them to nets.py.
- Python 3.10.12
- numpy 1.26.4
- pandas 2.1.4
- torch 2.3.1
- sklearn 1.3.2
- nlpaug 1.1.11
- transformers 4.42.4
- sentence_transformers 3.0.1
python demo.py \
--n_query 20 \
--n_init_labeled 20 \
--patient 5 \
--dataset_name SemEval_Restaurants \
--strategy_name LeastConfidence \
--active_learning True \
--enhanced_active_learning True \
--with_augmentation True \
[1] DeepAL: Deep Active Learning in Python, arXiv preprint, 2021
[2] A Sequential Algorithm for Training Text Classifiers, SIGIR, 1994
[3] Active Hidden Markov Models for Information Extraction, IDA, 2001
[4] Bert: Pre-training of deep bidirectional transformers for language understanding, arXiv preprint, 2018