SigmaWe/DocAsRef

Vectorizing MNLI inference

forrestbao opened this issue

The two segments of code below for MNLI are too slow. We should use a vectorized version to speed them up.

The approach below scores one pair of sentences at a time, which is too slow. Please see whether you can find an API that computes entailment for multiple pairs at a time.

Hugging Face's zero-shot-classification pipeline can do it. See my example at the end.

https://github.com/SigmaWe/DocAsRef_0/blob/de4de4b4275e661621bebf3b2f92d8676e2f81c2/mnli/sim.py#L10-L16

https://github.com/SigmaWe/DocAsRef_0/blob/de4de4b4275e661621bebf3b2f92d8676e2f81c2/mnli/eval.py#L22-L26
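
For readers who do not follow the permalinks: the slow path runs one forward pass per sentence pair, roughly like the hypothetical reconstruction below (the checkpoint name and function shape are my assumptions, not the repo's actual sim.py):

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Hypothetical reconstruction of the slow path, not the repo's actual code.
tokenizer = AutoTokenizer.from_pretrained("roberta-large-mnli")
model = AutoModelForSequenceClassification.from_pretrained("roberta-large-mnli")
model.eval()

def mnli_one_pair(premise: str, hypothesis: str) -> torch.Tensor:
    # One forward pass per pair: batch size 1 on every call.
    inputs = tokenizer(premise, hypothesis, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)  # [contradiction, neutral, entailment]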

In [1]: from transformers import pipeline

In [2]: classifier = pipeline("zero-shot-classification",
   ...:                       model="facebook/bart-large-mnli")

In [3]: sequence_to_classify = ["one day I will see the world", "i love swing dance"]

In [5]: candidate_labels = ['This blog is about summer.', 'This is my Friday night plan.']
   ...: classifier(sequence_to_classify, candidate_labels)

Out[5]: 
[{'sequence': 'one day I will see the world',
  'labels': ['This blog is about summer.', 'This is my Friday night plan.'],
  'scores': [0.7098779678344727, 0.2901219427585602]},
 {'sequence': 'i love swing dance',
  'labels': ['This is my Friday night plan.', 'This blog is about summer.'],
  'scores': [0.6118907332420349, 0.3881092965602875]}]
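
Note that the two scores for each sequence sum to 1: the pipeline defaults to multi_label=False, which softmaxes the entailment logits across the candidate labels. Passing multi_label=True scores each label independently instead.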

And, in the final paper, we can show results using different LMs: BART-MNLI is one, and the original RoBERTa-MNLI is another.

TURX commented

The zero-shot pipeline gives very different results from the text-classification task, even with the same labels. I will show you at tomorrow's meeting.
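
One knob worth ruling out first (my guess, untested): the zero-shot pipeline wraps each candidate label in a hypothesis template, by default "This example is {}.", so full-sentence labels like ours end up embedded in an awkward hypothesis. Reusing the classifier and inputs from the example above, the template can be bypassed:

# hypothesis_template="{}" uses each candidate label verbatim as the
# hypothesis, instead of the default "This example is {}."
classifier(sequence_to_classify, candidate_labels, hypothesis_template="{}")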

Maybe the reason is that the base model changed from RoBERTa to BART. We should perhaps use a RoBERTa-based model to be fair: https://huggingface.co/roberta-large-mnli
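
Swapping the checkpoint should be a one-line change, since the zero-shot pipeline accepts any NLI checkpoint whose config exposes an entailment label (a sketch, untested here):

from transformers import pipeline

# Same pipeline as above, with a RoBERTa-based MNLI checkpoint for a fairer comparison.
classifier = pipeline("zero-shot-classification", model="roberta-large-mnli")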

Ref: #10

So, per the discussion this afternoon, we will just vectorize the code below and forget about the zero-shot approach, which seems to have issues with long sentences.

https://github.com/SigmaWe/DocAsRef_0/blob/de4de4b4275e661621bebf3b2f92d8676e2f81c2/mnli/sim.py#L10-L16
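
A minimal sketch of what the vectorized version could look like, assuming roberta-large-mnli and plain batched forward passes (the helper name and batch size are mine, not the repo's):

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("roberta-large-mnli")
model = AutoModelForSequenceClassification.from_pretrained("roberta-large-mnli")
model.eval()

def mnli_scores_batched(premises, hypotheses, batch_size=32):
    # Score many (premise, hypothesis) pairs per forward pass
    # instead of one pair at a time.
    all_probs = []
    for i in range(0, len(premises), batch_size):
        batch = tokenizer(
            premises[i:i + batch_size],
            hypotheses[i:i + batch_size],
            padding=True,     # pad to the longest pair in the batch
            truncation=True,  # clip pairs beyond the model's max length
            return_tensors="pt",
        )
        with torch.no_grad():
            logits = model(**batch).logits
        # roberta-large-mnli label order: [contradiction, neutral, entailment]
        all_probs.append(torch.softmax(logits, dim=-1))
    return torch.cat(all_probs)

The entailment probability for pair i is then mnli_scores_batched(premises, hypotheses)[i, 2].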