g8a9/ferret

Dependency Issue


  • ferret version: 0.4.1
  • Python version: 3.10.12
  • Running on: Google Colab

Description

I'm trying to run:

bench.show_table(explanations)

and I get the following error:

AttributeError                            Traceback (most recent call last)
[<ipython-input-9-39a3816ba913>](https://localhost:8080/#) in <cell line: 1>()
----> 1 bench.show_table(explanation2)

[/usr/local/lib/python3.10/dist-packages/ferret/benchmark.py](https://localhost:8080/#) in show_table(self, explanations, apply_style, remove_first_last)
    396             table.columns = pd.io.parsers.base_parser.ParserBase(
    397                 {"names": table.columns, "usecols": None}
--> 398             )._maybe_dedup_names(table.columns)
    399 
    400         return (

AttributeError: 'ParserBase' object has no attribute '_maybe_dedup_names'

This only happens when the text I want to explain contains duplicate tokens. Apparently, in the latest version of pandas (2.2.2), _maybe_dedup_names is no longer available on ParserBase; it was probably replaced by _maybe_make_multi_index_columns.
I have tried downgrading pandas, but then new errors occur. Can you tell me which pandas version is used here?

Thanks.

g8a9 commented

Hey, thank you for reaching out. Yes, it does seem related to the deprecation of that pandas method. I believe the best fix is to rename duplicated columns (tokens) ourselves rather than relying on pandas for that, but I would include the change in a new library release.

Before doing that, can you share the Google Colab notebook or the snippet of code that crashes on your side?
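
For illustration, renaming duplicated columns without relying on pandas internals could look roughly like the sketch below. This is not the actual ferret fix: the helper name dedup_names and the ".1", ".2" suffix scheme are assumptions made for this example, and the function works on a plain list of token strings.

```python
from typing import Dict, Iterable, List, Set


def dedup_names(names: Iterable[str]) -> List[str]:
    """Make every name unique by suffixing repeats with .1, .2, ...

    Hypothetical helper for illustration; not part of ferret or pandas.
    """
    seen: Set[str] = set()        # names already emitted
    counts: Dict[str, int] = {}   # last suffix tried for each base name
    result: List[str] = []
    for name in names:
        candidate = name
        # Keep bumping the suffix until the candidate is unused, so a
        # generated "tok.1" never collides with a real token "tok.1".
        while candidate in seen:
            counts[name] = counts.get(name, 0) + 1
            candidate = f"{name}.{counts[name]}"
        seen.add(candidate)
        result.append(candidate)
    return result


tokens = ["You", "look", "stunning", "stunning", "!"]
print(dedup_names(tokens))
```

Assuming the table is a regular DataFrame, something like table.columns = dedup_names(table.columns) could then stand in for the ParserBase call shown in the traceback.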

Hi @g8a9, thanks for your reply.
This code is taken straight from the docs, except that the text example contains duplicate tokens:

from transformers import AutoModelForSequenceClassification, AutoTokenizer
from ferret import Benchmark

name = "cardiffnlp/twitter-xlm-roberta-base-sentiment"
model = AutoModelForSequenceClassification.from_pretrained(name)
tokenizer = AutoTokenizer.from_pretrained(name)

bench = Benchmark(model, tokenizer)

explanations = bench.explain("You look stunning stunning !", target=1)
evaluations = bench.evaluate_explanations(explanations, target=1)

bench.show_table(explanations)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
[<ipython-input-3-1ca62c9897f7>](https://localhost:8080/#) in <cell line: 1>()
----> 1 bench.show_table(explanations)

[/usr/local/lib/python3.10/dist-packages/ferret/benchmark.py](https://localhost:8080/#) in show_table(self, explanations, apply_style, remove_first_last)
    396             table.columns = pd.io.parsers.base_parser.ParserBase(
    397                 {"names": table.columns, "usecols": None}
--> 398             )._maybe_dedup_names(table.columns)
    399 
    400         return (

AttributeError: 'ParserBase' object has no attribute '_maybe_dedup_names'