Target is multiclass but average='binary'. Please choose another average setting, one of [None, 'micro', 'macro', 'weighted'].
MuvvaThriveni opened this issue · 5 comments
I am facing an issue when trying to build a multiclass classification model.
Here is my code from the start:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.utils import class_weight

data = pd.read_csv('/content/Normalized_Data_PBLD.csv')
y = data['label'].tolist()
# train/test split
X_train, X_test, y_train, y_test = train_test_split(data['comment'].tolist(), y, random_state=5, test_size=0.2)
# class weights, computed on the training labels
class_weights = class_weight.compute_class_weight(class_weight="balanced",
                                                  classes=np.unique(y_train),
                                                  y=y_train)
# validation split
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, random_state=5, test_size=0.1)
# map the string labels to integer class ids
list_of_class = {'NEG': 0, 'NTL': 1, 'POS': 2}
y_val = [list_of_class[i.strip()] for i in y_val]
y_train = [list_of_class[i.strip()] for i in y_train]
y_test = [list_of_class[i.strip()] for i in y_test]
d1 = {'comment': X_train, 'label': y_train}
df_train = pd.DataFrame(d1)
d2 = {'comment': X_val, 'label': y_val}
df_val = pd.DataFrame(d2)
d3 = {'comment': X_test, 'label': y_test}
df_test = pd.DataFrame(d3)
Calling the BERT model:
from simpletransformers.classification import ClassificationModel
model = ClassificationModel('bert', 'bert-base-multilingual-cased', num_labels=3, args={'learning_rate': 1e-5, 'num_train_epochs': 2, 'reprocess_input_data': True, 'overwrite_output_dir': True})
model.train_model(df_train)
result, model_outputs, wrong_predictions = model.eval_model(df_val)
Running the eval_model line above raises the following error:
ERROR:
ValueError: Target is multiclass but average='binary'. Please choose another average setting, one of [None, 'micro', 'macro', 'weighted'].
I then tried another way, passing my own metric functions to eval_model:
from sklearn.metrics import f1_score, accuracy_score
def f1_multiclass(labels, preds):
    return f1_score(labels, preds, average='weighted')
result, model_outputs, wrong_predictions = model.eval_model(df_val, f1=f1_multiclass, acc=accuracy_score)
Even so, I get the same error:
ValueError: Target is multiclass but average='binary'. Please choose another average setting, one of [None, 'micro', 'macro', 'weighted'].
Can anyone please help me solve this?
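For reference, the error message itself can be reproduced with scikit-learn alone, independent of eval_model. The snippet below is only a minimal sketch on made-up toy labels (the arrays are assumptions, not data from this issue). It shows that f1_score raises this ValueError when its default average='binary' is applied to a target with three classes, and that passing average='weighted' avoids it. It does not explain why the error persists inside eval_model.

```python
from sklearn.metrics import f1_score

# Toy 3-class labels (hypothetical, for illustration only)
y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 2, 2, 2, 1, 0]

# The default average='binary' is only valid for two-class targets,
# so this call raises the ValueError quoted above.
try:
    f1_score(y_true, y_pred)
    raised = False
    message = ""
except ValueError as err:
    raised = True
    message = str(err)
    print("ValueError:", message)

# Choosing an explicit multiclass averaging mode works.
weighted_f1 = f1_score(y_true, y_pred, average='weighted')
print(f"Weighted F1 Score: {weighted_f1:.4f}")
```

If a call like the last one works in isolation but eval_model still fails, the failing f1_score call is presumably made somewhere other than in the custom metric function.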
I am also encountering a similar ValueError [1] while evaluating a DistilBERT model for multi-class classification.
As a workaround [2], I computed the F1 score directly, outside of the eval_model method.
[1] Error
ValueError: Target is multiclass but average='binary'. Please choose another average setting, one of [None, 'micro', 'macro', 'weighted'].
[2] Workaround (tried the direct computation of the F1 score outside of the eval_model method)
from sklearn.metrics import f1_score
import numpy as np
# This part depends on how your model outputs predictions
predictions, raw_outputs = model.predict(valid_df['text'].tolist())
# This assumes `valid_df['labels']` contains the true class labels for each sample
true_labels = valid_df['labels'].values
# Calculate F1 Score
f1 = f1_score(true_labels, predictions, average='weighted')
print(f"Weighted F1 Score: {f1}")
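The same direct approach can also report one F1 score per class, which may help when the classes are imbalanced. This is a rough sketch on toy arrays standing in for true_labels and predictions. The class names NEG, NTL and POS are borrowed from the first post and are only an assumption here.

```python
import numpy as np
from sklearn.metrics import f1_score

# Toy stand-ins for true_labels and predictions (hypothetical values)
true_labels = np.array([0, 1, 2, 2, 1, 0])
predictions = np.array([0, 2, 2, 2, 1, 0])

# Assumed class names, in the order of their integer ids
class_names = ['NEG', 'NTL', 'POS']

# average=None returns an array with one F1 score per class
per_class_f1 = f1_score(true_labels, predictions, average=None)
for name, score in zip(class_names, per_class_f1):
    print(f"F1 for {name}: {score:.4f}")

# average='macro' is the unweighted mean of the per-class scores
macro_f1 = f1_score(true_labels, predictions, average='macro')
print(f"Macro F1 Score: {macro_f1:.4f}")
```

Unlike 'weighted', the 'macro' average gives every class the same weight regardless of how many samples it has.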
It would be greatly appreciated if someone could provide input regarding this issue.
I'm downgrading to 0.64.3