This is the official PyTorch implementation of the paper *Pretrained Generalized Autoregressive Model with Adaptive Probabilistic Label Clusters for Extreme Multi-label Text Classification*.
- Linux
- Python ≥ 3.6

  We recommend using Anaconda to create a conda environment:

  ```bash
  conda create --name aplc_xlnet python=3.6
  conda activate aplc_xlnet
  ```

- PyTorch ≥ 1.4.0

  ```bash
  conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch
  ```

- Other requirements:

  ```bash
  pip install -r requirements.txt
  ```
Download our preprocessed datasets from Google Drive and save them to `data/`
- Create `train.csv` and `dev.csv`. Reference our preprocessed datasets for the format of the CSV files
- Create `labels.txt`. Labels should be sorted in descending order according to their frequency
- Count the number of positive labels of each sample, take the largest value over all samples, and assign it to the hyperparameter `--pos_label`
- Add the dataset name into the dictionaries `processors` and `output_modes` in the source file `utils_multi_label.py`
- Create the bash file and set the hyperparameters in `code/run/`
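Two of the steps above (sorting `labels.txt` by frequency and finding the value for `--pos_label`) can be scripted. Below is a minimal sketch, assuming each CSV row stores its labels as a space-separated string in a `labels` column — check the preprocessed datasets for the actual column name and separator before using it:

```python
import csv
from collections import Counter


def analyze_labels(csv_path, label_field="labels", sep=" "):
    """Scan a training CSV and return (label frequencies, max positive
    labels per sample). The second value is what --pos_label expects.

    Assumes each row keeps its labels as a `sep`-separated string in
    `label_field`; verify this against the preprocessed CSV format.
    """
    freq = Counter()
    max_pos = 0
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            labels = row[label_field].split(sep)
            freq.update(labels)
            max_pos = max(max_pos, len(labels))
    return freq, max_pos


def write_labels_txt(freq, out_path="labels.txt"):
    """Write labels one per line, sorted by descending frequency,
    as the labels.txt step above requires."""
    with open(out_path, "w", encoding="utf-8") as f:
        for label, _ in freq.most_common():
            f.write(label + "\n")
```

For example, `freq, max_pos = analyze_labels("data/mydata/train.csv")` followed by `write_labels_txt(freq, "data/mydata/labels.txt")` produces the sorted label file and the value to pass as `--pos_label`.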
Preprocessed data for AttentionXML

Download our preprocessed datasets from Google Drive
- For dataset EURlex, the raw text is from the website
- For dataset Wiki500k, the raw text is from Google Drive
- For dataset Wiki10, AmazonCat and Amazon670k, the raw texts are from The Extreme Classification Repository
Run the commands

- For dataset EURlex: `bash ./run/eurlex.bash`
- For dataset Wiki10: `bash ./run/wiki10.bash`
- For dataset AmazonCat: `bash ./run/amazoncat.bash`
- For dataset Wiki500k: `bash ./run/wiki500k.bash`
- For dataset Amazon670k: `bash ./run/amazon670k.bash`
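Each bash file above wraps the training entry point with dataset-specific hyperparameters. For a new dataset, copy one of the existing files in `code/run/` (e.g. `./run/eurlex.bash`) and adjust it. The sketch below only illustrates the general shape of such a file — the script name `main.py` and every flag except `--pos_label` are assumptions, not the repo's actual interface:

```shell
#!/usr/bin/env bash
# Hypothetical run file for a dataset registered as "mydata".
# main.py and all flags except --pos_label are placeholders;
# mirror an existing file in code/run/ for the real interface.
python main.py \
    --task_name mydata \
    --data_dir ./data/mydata \
    --pos_label 25 \
    --output_dir ./models/mydata
```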
- Download our pretrained models from Google Drive and save them to `models/`
- Run the commands
- For dataset EURlex: `bash ./run/eurlex_eval.bash`
- For dataset Wiki10: `bash ./run/wiki10_eval.bash`
- For dataset AmazonCat: `bash ./run/amazoncat_eval.bash`
- For dataset Wiki500k: `bash ./run/wiki500k_eval.bash`
- For dataset Amazon670k: `bash ./run/amazon670k_eval.bash`
- For dataset EURlex: