Trainer.train() giving me KeyError: [random number]
fishroll23 opened this issue · 3 comments
System Info
peft == 0.10.0
transformers==4.40.2
python 3.10.11
Code
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoTokenizer, AutoModelForSequenceClassification, TrainingArguments, Trainer
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from transformers.data.data_collator import default_data_collator
import pandas as pd
df = pd.read_csv('data.csv')
train_df, eval_df = train_test_split(df, test_size=0.2, random_state=1)
train_df.reset_index(inplace=True, drop=True)
eval_df.reset_index(inplace=True, drop=True)
peft_config = LoraConfig(task_type=TaskType.SEQ_CLS,
                         inference_mode=False,
                         r=8,
                         lora_alpha=32,
                         lora_dropout=0.1)
tokenizer = AutoTokenizer.from_pretrained("alibaba-pai/pai-bert-base-zh-llm-risk-detection")
model = AutoModelForSequenceClassification.from_pretrained("alibaba-pai/pai-bert-base-zh-llm-risk-detection")
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()
training_args = TrainingArguments(
    output_dir="toxic_detect/pai-bert-base-zh-llm-risk-detection-lora",
    learning_rate=1e-3,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    num_train_epochs=2,
    weight_decay=0.01,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
)
def compute_metrics(pred):
    labels = pred.label_ids
    preds = pred.predictions.argmax(-1)
    precision, recall, f1, _ = precision_recall_fscore_support(labels, preds, average='binary')
    acc = accuracy_score(labels, preds)
    return {
        'accuracy': acc,
        'f1': f1,
        'precision': precision,
        'recall': recall
    }
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_df,
    eval_dataset=eval_df,
    tokenizer=tokenizer,
    data_collator=default_data_collator,
    compute_metrics=compute_metrics,
)
trainer.train()
model.save_pretrained("output_dir")
And here’s the output leading up to the error:
/Users/mac299/anaconda3/envs/pythonProject1/venv/bin/python /Users/mac299/anaconda3/envs/pythonProject1/train.py
/Users/mac299/anaconda3/envs/pythonProject1/venv/lib/python3.10/site-packages/bitsandbytes/cextension.py:34: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
warn("The installed version of bitsandbytes was compiled without GPU support. "
'NoneType' object has no attribute 'cadam32bit_grad_fp32'
/Users/mac299/anaconda3/envs/pythonProject1/venv/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
trainable params: 298,757 || all params: 102,570,250 || trainable%: 0.29127061696739553
0%| | 0/912 [00:00<?, ?it/s]Traceback (most recent call last):
File "/Users/mac299/anaconda3/envs/pythonProject1/venv/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 3805, in get_loc
return self._engine.get_loc(casted_key)
File "index.pyx", line 167, in pandas._libs.index.IndexEngine.get_loc
File "index.pyx", line 196, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 7081, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 7089, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 870
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/Users/mac299/anaconda3/envs/pythonProject1/train.py", line 58, in <module>
trainer.train()
File "/Users/mac299/anaconda3/envs/pythonProject1/venv/lib/python3.10/site-packages/transformers/trainer.py", line 1859, in train
return inner_training_loop(
File "/Users/mac299/anaconda3/envs/pythonProject1/venv/lib/python3.10/site-packages/transformers/trainer.py", line 2165, in _inner_training_loop
for step, inputs in enumerate(epoch_iterator):
File "/Users/mac299/anaconda3/envs/pythonProject1/venv/lib/python3.10/site-packages/accelerate/data_loader.py", line 454, in __iter__
current_batch = next(dataloader_iter)
File "/Users/mac299/anaconda3/envs/pythonProject1/venv/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 631, in __next__
data = self._next_data()
File "/Users/mac299/anaconda3/envs/pythonProject1/venv/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 675, in _next_data
data = self._dataset_fetcher.fetch(index) # may raise StopIteration
File "/Users/mac299/anaconda3/envs/pythonProject1/venv/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/Users/mac299/anaconda3/envs/pythonProject1/venv/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/Users/mac299/anaconda3/envs/pythonProject1/venv/lib/python3.10/site-packages/pandas/core/frame.py", line 4102, in __getitem__
indexer = self.columns.get_loc(key)
File "/Users/mac299/anaconda3/envs/pythonProject1/venv/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 3812, in get_loc
raise KeyError(key) from err
KeyError: 870
0%| | 0/912 [00:00<?, ?it/s]
I’ve tried a few different tutorials on HuggingFace, but so far no progress on this one.
I’m working in PyCharm, if that’s relevant. Please let me know if there’s any other information I could give you that would help with diagnosis.
Thanks for reading!
I'm pretty sure that the issue you see is not PEFT-related. If you remove the PEFT part, I expect the same type of error to occur. Most likely, the issue is that you try to use `Trainer` with a pandas `DataFrame`. I'm not super knowledgeable on `Trainer`, but I would be surprised if that worked. Check the docs for this argument.
If you have tabular data, I don't think `AutoModelForSequenceClassification` is a good fit anyway. If it's pure text data, I wouldn't load it as a `DataFrame`.
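To make that diagnosis concrete: in the traceback, the DataLoader calls `self.dataset[idx]` with an integer, and `DataFrame.__getitem__` looks that integer up as a column label, which is where `KeyError: 870` comes from. Below is a minimal sketch of the same behavior. The two-column frame is a made-up stand-in for `data.csv`, whose real column names are not shown in the issue.

```python
import pandas as pd

# Hypothetical stand-in for data.csv; the column names are assumptions.
df = pd.DataFrame({"text": ["a", "b", "c"], "label": [0, 1, 0]})

# A map-style DataLoader fetches samples as dataset[idx] with an integer idx.
# On a DataFrame, df[key] selects a column, so an integer key is not found.
try:
    df[1]
    error = None
except KeyError as exc:
    error = exc

print(type(error).__name__, error)

# Row access by position goes through .iloc instead.
row = df.iloc[1]
print(row["text"], row["label"])
```

The number in the error changes between runs because the sampler shuffles the indices, so the first index requested differs each time.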
Thanks for the answer. I'll try using datasets.Dataset instead of a DataFrame in Trainer and figure it out.
Good luck. I'll close the issue for now, as it seems to be unrelated to PEFT. Feel free to re-open if you encounter a PEFT error.