Reasoning-for-Sentiment-Analysis-Framework

The official code for the CoT / ZSL reasoning framework 🧠, utilized in the paper: "Large Language Models in Targeted Sentiment Analysis in Russian"

Primary language: Python. License: MIT.

Reasoning for Sentiment Analysis


Update 06 September 2024: Mentioning the related information about the project at BU-research-blog

Update 11 August 2024: 🎤 Announcing the talk on this framework @ NLPSummit 2024, with the preliminary ad and details in the X/Twitter post 🐦.

Update 05 June 2024: The frameworkless launch of the THoR-tuned FlanT5 model for direct inference is now available in GoogleColab.

Update 02 June 2024: 🤗 Models were uploaded to Hugging Face: 🤗 nicolay-r/flan-t5-tsa-prompt-xl, along with models of smaller sizes. Check out the related section.

Update 22 May 2024: ⚠️ in GoogleColab you might find zero (0) evaluation results on the test part (see issue #8), which is due to the locked availability of labels 🔐. To apply your model to the test part, please proceed with the official RuSentNE-2023 Codalab competition page or at GitHub.

Update 06 March 2024: 🔓 attrdict was the main limitation for launching the code in Python 3.10 and has hence been replaced with addict (see Issue#7).

🔥 Update 24/04/2024: We released fine-tuning logs for the prompt-based and THoR-based techniques applied to the competition train data, as well as checkpoints for downloading. More ...

💻 Update 19/04/2024: We opened the quick_cot code repository for launching quick CoT zero-shot-learning / few-shot-learning experiments with LLMs, utilized in these studies. More ...

📊 Update 19/04/2024: We opened a separate 📊 👉RuSentNE-benchmark repository👈 📊 for LLM responses, including answers on reasoning steps in THoR CoT for the ChatGPT model series. More ...

Studies and a collection of LLM-based reasoning frameworks for Targeted Sentiment Analysis. This repository contains the source code for the paper in the LJoM journal titled: Large Language Models in Targeted Sentiment Analysis for Russian.

Overview in Russian Language:

YouTube

Survey in English:

YouTube


Installation

We separate dependencies necessary for zero-shot and fine-tuning experiments:

pip install -r dependencies_zs.txt
pip install -r dependencies_ft.txt

Preparing Data

Simply launch the following script to obtain both the original and the translated texts:

python rusentne23_download.py

Manual Data Translation

You can launch manual data translation into English (en) via GoogleTrans:

 python rusentne23_translate.py --src "data/train_data.csv" --lang "en" --label
 python rusentne23_translate.py --src "data/valid_data.csv" --lang "en" --label
 python rusentne23_translate.py --src "data/final_data.csv" --lang "en"

Zero-Shot


This is a common script for launching LLM inference in zero-shot format using manual or predefined prompts:

python zero_shot_infer.py \
    --model "google/flan-t5-base" \
    --src "data/final_data_en.csv" \
    --prompt "rusentne2023_default_en" \
    --device "cpu" \
    --to "csv" \
    --temp 0.1 \
    --output "data/output.csv" \
    --max-length 512 \
    --hf-token "<YOUR_HUGGINGFACE_TOKEN>" \
    --openai-token "<YOUR_OPENAI_TOKEN>" \
    --limit 10000 \
    --limit-prompt 10000 \
    --bf16 \
    --l4b

Notes

Usage Examples

Chat mode

Simply set the model name and the device you wish to use for launching the model.

python zero_shot_infer.py --model google/flan-t5-base --device cpu

Inference with the predefined prompt

Use the --prompt parameter to pass either the name of a predefined prompt or a textual prompt that contains the {text} placeholder.

python zero_shot_infer.py --model google/flan-t5-small \
    --device cpu --src data/final_data_en.csv --prompt 'rusentrel2023_default_en'
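
As a rough illustration of how a textual prompt with the {text} placeholder might be filled for each input row, here is a minimal sketch. It is not taken from zero_shot_infer.py: the prompt wording, the `fill_prompts` helper, and the `sentence` column name are assumptions made for the example.

```python
import csv
import io

# Hypothetical textual prompt; only the {text} placeholder comes from the README.
PROMPT = "Text: {text}\nSentiment (positive, negative, or neutral):"


def fill_prompts(csv_file, prompt, text_column="sentence", limit=None):
    """Yield one filled prompt per CSV row.

    The column name is an assumption; adjust it to the actual data file.
    """
    reader = csv.DictReader(csv_file)
    for index, row in enumerate(reader):
        if limit is not None and index >= limit:
            break
        yield prompt.format(text=row[text_column])


# Tiny in-memory stand-in for a file such as data/final_data_en.csv.
sample = io.StringIO(
    "sentence\n"
    "The minister praised the new policy.\n"
    "The company lost the case.\n"
)
prompts = list(fill_prompts(sample, PROMPT, limit=1))
print(prompts[0])
```

The `limit` argument here only mimics the idea of capping the number of processed rows; how the script's own --limit option is implemented is not shown in this README.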

OpenAI models

Use the --model parameter with the openai: prefix followed by the model name, as follows:

python zero_shot_infer.py --model "openai:gpt-3.5-turbo-1106" \
    --src "data/final_data_en.csv" --prompt "rusentrel2023_default_en_short" \
    --max-length 75 --limit 5
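
The prefix convention above can be pictured with a small sketch that separates the provider from the model name. The `split_model_spec` helper and the `huggingface` fallback label are hypothetical and only illustrate the idea; the script's actual parsing may differ.

```python
def split_model_spec(spec, default_provider="huggingface"):
    """Split a model argument into (provider, model_name).

    Only the "openai:" prefix is mentioned in the README; everything else
    is treated here as a plain model path (an assumption for this sketch).
    """
    provider, separator, name = spec.partition(":")
    if separator and provider == "openai":
        return provider, name
    return default_provider, spec


print(split_model_spec("openai:gpt-3.5-turbo-1106"))
print(split_model_spec("google/flan-t5-base"))
```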

Zero-Shot Chain-of-Thought

This functionality is outside the scope of this repository.

We release a tiny framework, dubbed quick_cot, for applying CoT schemas written in JSON notation, with an API similar to the one in the Zero-Shot section.

Fine-tuned Flan-T5

👉 Prompt-Fine-Tuning Logs

👉 THoR-Fine-Tuning Logs

| Model | prompt | THoR |
|---|---|---|
| FlanT5-base | - | 🤗 nicolay-r/flan-t5-tsa-thor-base |
| FlanT5-large | - | 🤗 nicolay-r/flan-t5-tsa-thor-large |
| FlanT5-xl | 🤗 nicolay-r/flan-t5-tsa-prompt-xl | 🤗 nicolay-r/flan-t5-tsa-prompt-xl |

Three Hop Chain-of-Thought THoR


python thor_finetune.py -r "thor" -d "rusentne2023" \
    -m "google/flan-t5-base" \
    -li <PRETRAINED_STATE_INDEX> \
    -bs <BATCH_SIZE> \
    -es <EPOCH_SIZE> \
    -f "./config/config.yaml"

Parameters list

  • -c, --cuda_index: Index of the GPU to use for computation (default: 0).
  • -m, --model_path: Path to the model on Hugging Face.
  • -d, --data_name: Name of the dataset (rusentne2023).
  • -r, --reasoning: Specifies the reasoning mode (engine): single prompt or multi-step thor mode.
  • -li, --load_iter: Loads a state at the specified index from the same data_name resource (default: -1, not applicable).
  • -es, --epoch_size: Number of training epochs (default: 1).
  • -bs, --batch_size: Size of the batch (default: None).
  • -t, --temperature: Generation temperature (default: gen_config.temperature).
  • -z, --zero_shot: Runs zero-shot inference with the chosen engine on the test dataset to form answers.
  • -f, --config: Specifies the location of config.yaml file.
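
For orientation, the parameter list above could be expressed with argparse roughly as follows. This is a sketch reconstructed from the list only, not the actual parser in thor_finetune.py: the `build_parser` name, the `prompt`/`thor` choices, and the `None` placeholder for the temperature default are assumptions.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(description="THoR fine-tuning (sketch)")
    parser.add_argument("-c", "--cuda_index", type=int, default=0)
    parser.add_argument("-m", "--model_path", type=str)
    parser.add_argument("-d", "--data_name", type=str)
    # Choices are inferred from the description of the reasoning modes.
    parser.add_argument("-r", "--reasoning", choices=["prompt", "thor"])
    parser.add_argument("-li", "--load_iter", type=int, default=-1)
    parser.add_argument("-es", "--epoch_size", type=int, default=1)
    parser.add_argument("-bs", "--batch_size", type=int, default=None)
    # None stands in for "take the value from the generation config".
    parser.add_argument("-t", "--temperature", type=float, default=None)
    parser.add_argument("-z", "--zero_shot", action="store_true")
    parser.add_argument("-f", "--config", type=str)
    return parser


args = build_parser().parse_args([
    "-r", "thor", "-d", "rusentne2023",
    "-m", "google/flan-t5-base",
    "-bs", "4", "-es", "2",
    "-f", "./config/config.yaml",
])
print(args.reasoning, args.batch_size, args.epoch_size, args.load_iter)
```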

Configure more parameters in the config.yaml file.
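
To give an intuition for the multi-step thor mode, the sketch below chains three prompts, assuming the usual THoR ordering of aspect, then opinion, then polarity. The prompt wording, the `three_hop` function, and the scripted stand-in model are illustrative assumptions and are not taken from this repository's code.

```python
def three_hop(ask, sentence, target):
    """Run three chained prompts; `ask` maps a prompt string to a model answer."""
    # Hop 1: which aspect of the target is discussed.
    hop1 = (f'Given the sentence "{sentence}", '
            f"which specific aspect of {target} is possibly mentioned?")
    aspect = ask(hop1)
    # Hop 2: the opinion towards that aspect, with the first answer as context.
    hop2 = (f"{hop1} {aspect} Based on common sense, what is the implicit "
            f"opinion towards the mentioned aspect of {target}, and why?")
    opinion = ask(hop2)
    # Hop 3: the final polarity, with both earlier answers as context.
    hop3 = (f"{hop2} {opinion} Based on such opinion, what is the sentiment "
            f"polarity towards {target}?")
    polarity = ask(hop3)
    return {"aspect": aspect, "opinion": opinion, "polarity": polarity}


def make_scripted_model(answers):
    """Return a fake `ask` callable that replays canned answers and logs prompts."""
    log = []
    replies = iter(answers)

    def ask(prompt):
        log.append(prompt)
        return next(replies)

    return ask, log


ask, log = make_scripted_model([
    "The aspect is the new policy.",
    "The opinion is approving.",
    "positive",
])
result = three_hop(ask, "The minister praised the new policy.", "minister")
print(result["polarity"])
```

In a real run, `ask` would wrap a call to the fine-tuned model; the scripted version only makes the chaining of context between hops visible.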

Answers

Results of the zero-shot models obtained during experiments fall outside the scope of this repository. We opened the separate RuSentNE-benchmark repository for LLM responses, including answers on reasoning steps in THoR CoT for the ChatGPT model series.

References

You can cite this work as follows:

@misc{rusnachenko2024large,
      title={Large Language Models in Targeted Sentiment Analysis}, 
      author={Nicolay Rusnachenko and Anton Golubev and Natalia Loukachevitch},
      year={2024},
      eprint={2404.12342},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}