Dependency Issue

Question

Opened this issue 5 months ago · 2 comments

ferret version: 0.4.1
Python version: 3.10.12
Running on: Google Colab

Description

I'm trying to run:

bench.show_table(explanations)

and I get the following error:

AttributeError                            Traceback (most recent call last)
[<ipython-input-9-39a3816ba913>](https://localhost:8080/#) in <cell line: 1>()
----> 1 bench.show_table(explanation2)

[/usr/local/lib/python3.10/dist-packages/ferret/benchmark.py](https://localhost:8080/#) in show_table(self, explanations, apply_style, remove_first_last)
    396             table.columns = pd.io.parsers.base_parser.ParserBase(
    397                 {"names": table.columns, "usecols": None}
--> 398             )._maybe_dedup_names(table.columns)
    399 
    400         return (

AttributeError: 'ParserBase' object has no attribute '_maybe_dedup_names'

This only happens when the text example I want to explain contains duplicate tokens. Apparently, in the latest version of pandas (2.2.2), _maybe_dedup_names is no longer available; it was probably replaced by _maybe_make_multi_index_columns.
I have tried downgrading pandas, but new errors occur. Can you tell me which pandas version is used here?

Thanks.

Answer 1 · 2024-04-26T09:22:19.000Z

Hey, thank you for reaching out. Yes, it does seem related to the deprecation of that pandas method. I believe the best approach here is to rename duplicated columns (tokens) ourselves (i.e., without relying on pandas), but I would include the change in a new library release.

Before doing that, can you share the Google Colab notebook or the snippet of code that crashes on your side?
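
For illustration, renaming duplicated tokens without pandas internals could look roughly like the sketch below. The helper name `dedup_names` and the `.1`, `.2` suffix style are assumptions made for this example (the suffixes mimic how pandas mangles duplicate column names when reading files); this is not the fix that was actually shipped in ferret.

```python
from collections import Counter


def dedup_names(names):
    """Return a copy of `names` where repeated entries get a numeric suffix.

    Hypothetical helper, not part of ferret or pandas.
    """
    names = list(names)
    taken = set(names)   # every name already in use, original or generated
    seen = Counter()     # how many times each original name has appeared
    result = []
    for name in names:
        count = seen[name]
        seen[name] += 1
        if count == 0:
            # First occurrence keeps its name unchanged.
            result.append(name)
            continue
        # Later occurrences get ".<n>", skipping suffixes that would
        # collide with a name that is already taken.
        candidate = f"{name}.{count}"
        while candidate in taken:
            count += 1
            candidate = f"{name}.{count}"
        seen[name] = count + 1
        taken.add(candidate)
        result.append(candidate)
    return result


tokens = ["You", "look", "stunning", "stunning", "!"]
print(dedup_names(tokens))
```

Inside show_table, something like `table.columns = dedup_names(table.columns)` could then stand in for the call to the private `ParserBase` method, assuming the columns are plain strings.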

Answer 2 · 2024-04-26T11:57:14.000Z

Hi @g8a9, thanks for your reply.
This code is taken straight from the docs, except that the text example contains duplicate tokens:

from transformers import AutoModelForSequenceClassification, AutoTokenizer
from ferret import Benchmark

name = "cardiffnlp/twitter-xlm-roberta-base-sentiment"
model = AutoModelForSequenceClassification.from_pretrained(name)
tokenizer = AutoTokenizer.from_pretrained(name)

bench = Benchmark(model, tokenizer)

explanations = bench.explain("You look stunning stunning !", target=1)
evaluations = bench.evaluate_explanations(explanations, target=1)

bench.show_table(explanations)

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
[<ipython-input-3-1ca62c9897f7>](https://localhost:8080/#) in <cell line: 1>()
----> 1 bench.show_table(explanations)

[/usr/local/lib/python3.10/dist-packages/ferret/benchmark.py](https://localhost:8080/#) in show_table(self, explanations, apply_style, remove_first_last)
    396             table.columns = pd.io.parsers.base_parser.ParserBase(
    397                 {"names": table.columns, "usecols": None}
--> 398             )._maybe_dedup_names(table.columns)
    399 
    400         return (

AttributeError: 'ParserBase' object has no attribute '_maybe_dedup_names'