allenai/scibert

enhancement - plug in to bert-as-service


Actually, I think all you need to do @johndpope is set `pooling_layer = -1`. I tried using SciBERT directly with BaaS and the performance was anecdotally worse than BERT-large. Then I remembered that BaaS defaults to running inference from the second-to-last layer, because the last layer is task-specific. That isn't relevant for SciBERT, so you can just set `pooling_layer` to -1 and, again anecdotally, the results are what you'd expect: better inference on scientific vocabulary.
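For reference, a minimal sketch of what that looks like when starting the BaaS server from Python; the SciBERT checkpoint path is just a placeholder, and everything other than `-pooling_layer` is left at its default:

```python
from bert_serving.server import BertServer
from bert_serving.server.helper import get_args_parser

# Placeholder path: point this at the unpacked SciBERT TF checkpoint directory.
MODEL_DIR = '/tmp/scibert_scivocab_uncased/'

args = get_args_parser().parse_args([
    '-model_dir', MODEL_DIR,
    '-num_worker', '1',
    # BaaS defaults to -2 (second-to-last layer); override to use the last layer.
    '-pooling_layer', '-1',
])

server = BertServer(args)
server.start()
```

The equivalent CLI invocation would be `bert-serving-start -model_dir <scibert_dir> -pooling_layer -1`.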

E.g. I have a customised example8.py that ranks sentences by cosine similarity. I have a series of tests that check things like terms that are related through definition (in this case BTX with toluene and xylenes).
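Roughly, the ranking step of that script looks like the sketch below; the corpus and query here are placeholders, and it assumes a BaaS server is already running locally:

```python
import numpy as np
from bert_serving.client import BertClient

# Placeholder corpus: the candidate sentences to rank against the query.
sentences = [
    'BTX refers to mixtures of benzene, toluene and xylenes.',
    'The reactor was operated at 350 degrees Celsius.',
    'Toluene and xylenes are common aromatic solvents.',
]
query = 'BTX aromatics'

bc = BertClient()
sent_vecs = bc.encode(sentences)
query_vec = bc.encode([query])[0]

# Cosine similarity: dot product of L2-normalised vectors.
sent_vecs = sent_vecs / np.linalg.norm(sent_vecs, axis=1, keepdims=True)
query_vec = query_vec / np.linalg.norm(query_vec)
scores = sent_vecs @ query_vec

# Rank sentences by similarity, best first.
for rank, idx in enumerate(np.argsort(-scores), start=1):
    print(f'{rank}. ({scores[idx]:.3f}) {sentences[idx]}')
```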

For the query "BTX aromatics", the target sentence match was ranked 2 with SciBERT, while with BERT-large it was ranked 5.