Execution of example from the Using the evaluator docs fails due to unspecified tokenizer
jpodivin opened this issue · 0 comments
jpodivin commented
Instead of calculating metrics, the first example of evaluation[1] fails since the tokenizer isn't provided nor inferred.
Exception: Impossible to guess which tokenizer to use. Please provide a PreTrainedTokenizer class or a path/identifier to a pretrained tokenizer.
To replicate, simply try to execute following:
from datasets import load_dataset
from evaluate import evaluator
from transformers import AutoModelForSequenceClassification, pipeline
data = load_dataset("imdb", split="test").shuffle(seed=42).select(range(1000))
task_evaluator = evaluator("text-classification")
# 1. Pass a model name or path
eval_results = task_evaluator.compute(
model_or_pipeline="lvwerra/distilbert-imdb",
data=data,
label_mapping={"NEGATIVE": 0, "POSITIVE": 1}
)
# 2. Pass an instantiated model
model = AutoModelForSequenceClassification.from_pretrained("lvwerra/distilbert-imdb")
eval_results = task_evaluator.compute(
model_or_pipeline=model,
data=data,
label_mapping={"NEGATIVE": 0, "POSITIVE": 1}
)
evaluate===0.4.1