huggingface/peft

Trainer.train() giving me Key Error: [random number]

fishroll23 opened this issue · 3 comments

System Info

peft == 0.10.0
transformers==4.40.2
python 3.10.11

Code

from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoTokenizer, AutoModelForSequenceClassification, TrainingArguments, Trainer
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from transformers.data.data_collator import default_data_collator
import pandas as pd

df = pd.read_csv('data.csv')
train_df, eval_df = train_test_split(df, test_size=0.2, random_state=1)
train_df.reset_index(inplace=True, drop=True)
eval_df.reset_index(inplace=True, drop=True)

peft_config = LoraConfig(task_type=TaskType.SEQ_CLS,
                         inference_mode=False,
                         r=8,
                         lora_alpha=32,
                         lora_dropout=0.1)

tokenizer = AutoTokenizer.from_pretrained("alibaba-pai/pai-bert-base-zh-llm-risk-detection")
model = AutoModelForSequenceClassification.from_pretrained("alibaba-pai/pai-bert-base-zh-llm-risk-detection")

model = get_peft_model(model, peft_config)
model.print_trainable_parameters()

training_args = TrainingArguments(
    output_dir="toxic_detect/pai-bert-base-zh-llm-risk-detection-lora",
    learning_rate=1e-3,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    num_train_epochs=2,
    weight_decay=0.01,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
)

def compute_metrics(pred):
    labels = pred.label_ids
    preds = pred.predictions.argmax(-1)
    precision, recall, f1, _ = precision_recall_fscore_support(labels, preds, average='binary')
    acc = accuracy_score(labels, preds)
    return {
        'accuracy': acc,
        'f1': f1,
        'precision': precision,
        'recall': recall
    }

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_df,
    eval_dataset=eval_df,
    tokenizer=tokenizer,
    data_collator=default_data_collator,
    compute_metrics=compute_metrics,
)

trainer.train()

model.save_pretrained("output_dir")

And here’s the output leading up to the error:

/Users/mac299/anaconda3/envs/pythonProject1/venv/bin/python /Users/mac299/anaconda3/envs/pythonProject1/train.py
/Users/mac299/anaconda3/envs/pythonProject1/venv/lib/python3.10/site-packages/bitsandbytes/cextension.py:34: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
  warn("The installed version of bitsandbytes was compiled without GPU support. "
'NoneType' object has no attribute 'cadam32bit_grad_fp32'
/Users/mac299/anaconda3/envs/pythonProject1/venv/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
trainable params: 298,757 || all params: 102,570,250 || trainable%: 0.29127061696739553
  0%|          | 0/912 [00:00<?, ?it/s]Traceback (most recent call last):
  File "/Users/mac299/anaconda3/envs/pythonProject1/venv/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 3805, in get_loc
    return self._engine.get_loc(casted_key)
  File "index.pyx", line 167, in pandas._libs.index.IndexEngine.get_loc
  File "index.pyx", line 196, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 7081, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 7089, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 870

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/mac299/anaconda3/envs/pythonProject1/train.py", line 58, in <module>
    trainer.train()
  File "/Users/mac299/anaconda3/envs/pythonProject1/venv/lib/python3.10/site-packages/transformers/trainer.py", line 1859, in train
    return inner_training_loop(
  File "/Users/mac299/anaconda3/envs/pythonProject1/venv/lib/python3.10/site-packages/transformers/trainer.py", line 2165, in _inner_training_loop
    for step, inputs in enumerate(epoch_iterator):
  File "/Users/mac299/anaconda3/envs/pythonProject1/venv/lib/python3.10/site-packages/accelerate/data_loader.py", line 454, in __iter__
    current_batch = next(dataloader_iter)
  File "/Users/mac299/anaconda3/envs/pythonProject1/venv/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 631, in __next__
    data = self._next_data()
  File "/Users/mac299/anaconda3/envs/pythonProject1/venv/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 675, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/Users/mac299/anaconda3/envs/pythonProject1/venv/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/Users/mac299/anaconda3/envs/pythonProject1/venv/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/Users/mac299/anaconda3/envs/pythonProject1/venv/lib/python3.10/site-packages/pandas/core/frame.py", line 4102, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/Users/mac299/anaconda3/envs/pythonProject1/venv/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 3812, in get_loc
    raise KeyError(key) from err
KeyError: 870
  0%|          | 0/912 [00:00<?, ?it/s]

I’ve tried a few different tutorials on HuggingFace, but so far no progress on this one.

I’m working on PyCharm, if it’s relevant. Please let me know if there’s any other information I could give you that would help with diagnosis.

Thanks for reading!

I'm pretty sure that the issue you see is not PEFT-related. If you remove the PEFT part, I expect the same type of error to occur. Most likely, the issue is that you are passing a pandas DataFrame to Trainer. I'm not super knowledgeable on Trainer, but I would be surprised if that worked. Check the Trainer docs for the train_dataset and eval_dataset arguments.

If you have tabular data, I don't think AutoModelForSequenceClassification is a good fit anyway. If it's pure text data, I wouldn't load it as a DataFrame.
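
To illustrate why the key in the error looks random: the traceback ends in pandas' `DataFrame.__getitem__`, which calls `self.columns.get_loc(key)`. In other words, when the DataLoader asks for `dataset[idx]`, a DataFrame treats `idx` as a column name rather than a row position, and the shuffled sampler picks a different `idx` on each run. Below is a minimal sketch of that behavior; the column names `text` and `label` are made up for illustration, since the actual columns of `data.csv` are not shown in this issue.

```python
import pandas as pd

# Hypothetical toy data; "text" and "label" are assumed column names.
df = pd.DataFrame({"text": ["good", "bad", "fine"], "label": [0, 1, 0]})

# The DataLoader fetches samples with dataset[idx].
# On a DataFrame, df[idx] looks up a COLUMN named idx, not row idx.
try:
    df[2]
    lookup_error = None
except KeyError as err:
    lookup_error = err

print(repr(lookup_error))

# Row access by position needs .iloc instead.
row = df.iloc[2].to_dict()
print(row)
```

So the number in the `KeyError` is most likely just the first row index drawn by the sampler, which would explain why it changes between runs.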

Thanks for the answer. I'll try using datasets.Dataset instead of a DataFrame in Trainer and figure it out.

Good luck. I'll close the issue for now, as it seems to be unrelated to PEFT. Feel free to re-open if you encounter a PEFT error.