cross lingual

Question

Closed this issue 6 months ago · 8 comments

mars203030 commented 7 months ago

Hi

Can I use this metric for cross-lingual evaluation?

Answer 1 · 2024-04-27T08:31:26.000Z

Hi @mars203030,

No, ROUGE requires the reference and prediction text to be in the same language. For cross-lingual evaluation, you can look at this metric.

Answer 2 · 2024-04-29T02:11:14.000Z

Thank you so much, I will try it.

But with your ROUGE library I get the error below. Another question: which Arabic stemmer do I need to install?

`
ImportError Traceback (most recent call last)
Cell In[15], line 3
1 import sys
2 sys.path.append('/multilingual_rouge_scoring')
----> 3 from multilingual_rouge_scoring import rouge_scorer
6 scorer = rouge_scorer.RougeScorer(['rouge1', 'rougeL'], use_stemmer=True, lang="arabic")
8 scores = scorer.get_scores(conversation_ar, ar_note)

File ~/Downloads/Visualization/NeuroNLP/attempt3/multilingual_rouge_scoring/rouge_scorer.py:37
35 from six.moves import range
36 from rouge_score import scoring
---> 37 from rouge_score import tokenization_wrapper as tokenize
38 import pyonmttok
39 import collections

ImportError: cannot import name 'tokenization_wrapper' from 'rouge_score'
`

Answer 3 · 2024-04-29T08:51:09.000Z

1 import sys
2 sys.path.append('/multilingual_rouge_scoring')
----> 3 from multilingual_rouge_scoring import rouge_scorer
6 scorer = rouge_scorer.RougeScorer(['rouge1', 'rougeL'], use_stemmer=True, lang="arabic")
8 scores = scorer.get_scores(conversation_ar, ar_note)

This is not how you are supposed to use this library. First, install the package with the instructions given here. Then follow these examples on how to use this package with Python.

Another question: which Arabic stemmer do I need to install?

This repo uses the NLTK SnowballStemmer module for Arabic. It'll be installed automatically when you install our package.

Answer 4 · 2024-04-29T13:28:05.000Z

I am still having an issue here

here is my installation

and this is the code and the error

Answer 5 · 2024-04-29T14:25:33.000Z

Your notebook isn't running in the same environment where you installed the package. I've replicated the correct workflow in this Colab notebook. Please follow it.
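
One way to confirm this kind of mismatch from inside the notebook is to print which interpreter the kernel is running and whether it can locate the package. This is only a minimal sketch using the standard library; `describe_environment` is a hypothetical helper name, and `rouge_score` is the module name that appears in the traceback earlier in this thread.

```python
import importlib.util
import sys


def describe_environment(module_name):
    """Report the running interpreter and where (if anywhere) a module is found."""
    spec = importlib.util.find_spec(module_name)
    return {
        "interpreter": sys.executable,  # the Python binary behind this kernel
        "module": module_name,
        "location": spec.origin if spec is not None else None,
    }


info = describe_environment("rouge_score")
print(info)
```

If `location` is `None`, or `interpreter` is not the Python you ran `pip` with, the package was probably installed into a different environment. Installing from a notebook cell with `%pip install` and then restarting the kernel usually targets the kernel's own environment.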

Answer 6 · 2024-04-29T19:02:47.000Z

Thanks, I restarted the kernel and it is working fine.
I have some questions regarding LaSE. It is working; I am comparing English text (reference) with Arabic text (predicted).

1) I would like to know what a good LaSE score is and what its range is.
Below is my result for one input:
`
from LaSE import LaSEScorer

scorer = LaSEScorer()

score = scorer.score(
    clinical_note,
    conversation_ar,
    # language name of the reference text
)

print(score)
`

LaSEResult(ms=0.6220683, lc=1.0, lp=1.0, LaSE=0.6220682859420776)

2) If I define target_lang, I receive this error: ValueError: predict processes one line at a time (remove '\n')
3) Is there a max length? My generated text is around 4000 words.

For the ROUGE score, is there also a max length?
There is a small difference in the results for English text summarization between the original Google package and the multilingual package. How is this difference explained?
google : {'rouge1': Score(precision=0.37389380530973454, recall=0.49852507374631266, fmeasure=0.42730720606826805), 'rouge2': Score(precision=0.07982261640798226, recall=0.10650887573964497, fmeasure=0.09125475285171102), 'rougeL': Score(precision=0.17699115044247787, recall=0.2359882005899705, fmeasure=0.202275600505689), 'rougeLsum': Score(precision=0.32964601769911506, recall=0.4421364985163205, fmeasure=0.37769328263624846)}

MLRouge: English : {'rouge1': Score(precision=0.3893805309734513, recall=0.5191740412979351, fmeasure=0.44500632111251587), 'rouge2': Score(precision=0.08869179600886919, recall=0.11834319526627218, fmeasure=0.10139416983523449), 'rougeL': Score(precision=0.18584070796460178, recall=0.24778761061946902, fmeasure=0.21238938053097345), 'rougeLsum': Score(precision=0.33849557522123896, recall=0.4540059347181009, fmeasure=0.3878326996197719)}

Regards

Answer 7 · 2024-05-01T16:01:59.000Z

I would like to know what a good LaSE score is and what its range is.

The value range for LaSE is [0, 1]. In general, we found that good summaries have a LaSE score above 0.5.
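
As a rough illustration of that range, the result posted earlier in this thread is consistent with the final score being the product of its three components. Treat that as an assumption rather than a statement of the implementation; reading `ms`, `lc`, and `lp` as meaning similarity, language confidence, and length penalty is likewise an assumption here.

```python
# Component values copied from the LaSEResult posted earlier in this thread.
ms, lc, lp = 0.6220683, 1.0, 1.0

# Assumption: the final score is the product of the three components.
# If each factor lies in [0, 1], the product stays in [0, 1] as well.
combined = ms * lc * lp
print(combined)
```

Under this reading, the posted score of about 0.62 would sit above the 0.5 mark mentioned above.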

If I define target_lang, I receive this error: ValueError: predict processes one line at a time (remove '\n')

The target evaluation domain of this metric was short, single-line summaries. Therefore, as indicated by the error, you need to make sure your reference and prediction texts don't contain newlines.
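
A minimal sketch of one way to do that before scoring, using only the standard library. `to_single_line` is a hypothetical helper name, not part of the LaSE package.

```python
def to_single_line(text):
    """Collapse every run of whitespace, including newlines, into one space."""
    # str.split() with no arguments splits on any Unicode whitespace,
    # so "\n", "\r\n", and tabs are all removed here.
    return " ".join(text.split())


sample = "first line\nsecond line\r\n\n third line"
print(to_single_line(sample))
```

Applying the helper to both the reference and the prediction before passing them to the scorer should avoid the error quoted above.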

Is there a max length? My generated text is around 4000 words.

The embedding model behind LaSE, namely LaBSE, only supports sequences up to 512 tokens.
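
A quick sanity check for this limit can be sketched without loading the model. It assumes that a subword tokenizer produces at least one token per whitespace-separated word, so the word count is only a rough lower bound on the token count. `surely_exceeds_limit` is a hypothetical helper name.

```python
MAX_TOKENS = 512  # limit quoted in the answer above


def surely_exceeds_limit(text, max_tokens=MAX_TOKENS):
    """True when the whitespace word count alone already exceeds max_tokens."""
    # A False result does NOT guarantee the text fits: words are often
    # split into several subword tokens by the real tokenizer.
    return len(text.split()) > max_tokens


long_text = "word " * 4000
print(surely_exceeds_limit(long_text))
```

A text of around 4000 words is well past the limit under this check, so only part of it could be embedded.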

For the ROUGE score, is there also a max length?

No.

There is a small difference in the results for English text summarization between the original Google package and the multilingual package. How is this difference explained?

The difference is in the tokenization, stemming, and character filtering policies. For example, the Google package removes all non-alphanumeric characters and applies stemming when the token length exceeds a threshold, which we don't do, in order to support multilingual evaluation. Please see both implementations to get a better idea of all the differences.
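
To see how a tokenization policy alone can move the numbers, here is a small self-contained sketch. Neither tokenizer below is taken from either package: `strip_tokenize` and `whitespace_tokenize` are hypothetical stand-ins, the regex is an assumed reading of "non-alphanumeric", and `unigram_f1` is a simplified ROUGE-1-style F-measure without stemming.

```python
import re
from collections import Counter


def strip_tokenize(text):
    """Lowercase, then treat every run of non-alphanumeric ASCII as a separator."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).split()


def whitespace_tokenize(text):
    """Lowercase and split on whitespace only, keeping punctuation attached."""
    return text.lower().split()


def unigram_f1(reference, prediction, tokenize):
    """ROUGE-1-style F-measure computed from clipped unigram overlap."""
    ref_counts = Counter(tokenize(reference))
    pred_counts = Counter(tokenize(prediction))
    overlap = sum((ref_counts & pred_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


reference = "The patient reports chest pain, nausea and fatigue."
prediction = "Patient reports chest pain and fatigue"

f_strip = unigram_f1(reference, prediction, strip_tokenize)
f_ws = unigram_f1(reference, prediction, whitespace_tokenize)
print(f_strip, f_ws)
```

The same pair of texts gets two different scores because "pain," and "fatigue." only match their counterparts once punctuation is stripped. The direction and size of the gap between the two real packages depend on their full tokenization and stemming rules, which this sketch does not reproduce.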

Answer 8 · 2024-05-17T14:41:25.000Z

Thank you very much for your generous reply and patience.