Using dynamic data augmentation for Sentiment Analysis on Hinglish Code-Switched text

Refer to report.pdf for details about the project.

Setup instructions

Note: Changing the directory structure of the 'cslm' package might break the poetry scripts.

Using poetry

pip install poetry
poetry install

To create dataset splits for,

When training on a different dataset, make sure to change num_classes in config.json and update the labels2num dict in dataset.py so that the two agree.
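The two settings have to stay in sync, so a quick sanity check can help. The following is only a minimal sketch: the label names, the inline config fragment, and the helper check_label_config are hypothetical and not part of the cslm package; only num_classes and labels2num come from this README.

```python
import json

# Hypothetical label mapping; the real one lives in dataset.py and may differ.
labels2num = {"negative": 0, "neutral": 1, "positive": 2}

# Hypothetical config fragment; a real run would load config.json from disk.
config = json.loads('{"num_classes": 3}')


def check_label_config(labels2num, config):
    """Return num_classes if the label ids are exactly 0..num_classes-1."""
    ids = sorted(labels2num.values())
    n = config["num_classes"]
    if ids != list(range(n)):
        raise ValueError(
            f"labels2num ids {ids} do not match num_classes={n}"
        )
    return n


num_classes = check_label_config(labels2num, config)
print(num_classes)
```

Running a check like this before training catches a mismatch early, rather than as a shape error inside the loss function.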

To train a model, make the appropriate changes to the config file (an example config file is at config.json) and run:

poetry run train --config <config-json>

To test a model, it is easiest to keep the config file (an example config file is at config.json) the same as during training and run:

poetry run test --config <config-json>

To authorize wandb, run:

poetry run python -m wandb login

To enable wandb, set os.environ["WANDB_MODE"] = "online"

If online mode fails on the cluster, run in offline mode by setting os.environ["WANDB_MODE"] = "offline", then sync the run separately using:

wandb sync <wandb-run-path>
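The mode switch described above can be sketched in Python as follows. This is an illustrative sketch, not the project's actual code: the helper set_wandb_mode is hypothetical, and it assumes the variable is set before wandb is initialized in the training script.

```python
import os


def set_wandb_mode(online: bool) -> str:
    """Set WANDB_MODE to "online" or "offline" and return the chosen mode."""
    mode = "online" if online else "offline"
    # Assumption: this runs before wandb is initialized, so the setting is picked up.
    os.environ["WANDB_MODE"] = mode
    return mode


# Example: fall back to offline logging on a cluster without network access.
mode = set_wandb_mode(online=False)
print(mode)
```

Runs logged in offline mode stay on local disk until they are uploaded with wandb sync.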