/nlp-template

Natural Language Processing Template

Primary LanguagePythonMIT LicenseMIT

NLP Template

Customizable NLP Model Development Template

This customizable template for natural language processing (NLP) tasks. With simple modifications in a single config.py file, it enables the development of various NLP models. The primary goal is to offer a starting point for developers working in the NLP domain.

Usage

Check the example usage of nlp-template on this notebook

Install

Clone the project to your local environment:

git clone https://github.com/aliosmankaya/nlp-template.git

Install the necessary dependencies:

pip install -r requirements.txt

Data Format

  • Data must be a CSV file
  • Data must have 2 columns:
    • text: Text body (str)
    • labels: Text labels (int)

Customize

  • task: Problem task (str, defaults to "binary")
  • num_labels: Number of labels (int, defaults to 2)
  • model_path: Pretrained model path (str, defaults to "bert-base-uncased")
  • epochs: Epochs number (int, defaults to 3)
  • batch_size: Batch size (int, defaults to 8)
  • max_length: Maximum length of the sequence (int, defaults to 200)
  • learning_rate: Learning rate (float, defaults to 5e-5)
  • seed: Seed number (int, defaults to 42)
  • device: Device type (str, defaults "cuda" or "cpu")
  • val_strategy: Validation strategy (func, defaults to StratifiedKFold)
  • n_splits: Fold number (int, defaults to 5)
  • metric: Metric (func, defaults to F1Score)

Train

To initiate the training phase, use the following command:

python3 train.py --path PATH
  • PATH : Train data path

Inference

Coming soon...