asahi417/tner

Building docker image

tednaseri opened this issue · 2 comments

Hi @asahi417, in one of my projects I am using several NLP models, including NER, sentiment, and other text classifiers.
The models are all based on PyTorch, and I need to build a single Docker image that supports all of them. For every model except T-NER I could build the image and it works smoothly, but building one for T-NER has given me trouble, since it is built on top of
AllenNLP and also needs GCC.
It would be great if you could provide some guidance for this task.

What approach do you recommend for building an image for the T-NER model?
What base image do you recommend: a Python base image, an NVIDIA base image, a Torch base image, or something else?
Do you recommend using a conda environment in the Dockerfile?

Some details:
Model of interest: https://huggingface.co/tner/twitter-roberta-base-dec2021-tweetner7-all
Requirements: python==3.9 torch==1.12.0+cu113 torchvision==0.13.0+cu113 torchaudio==0.12.0

The reason T-NER needs AllenNLP is that AllenNLP provides the CRF, a specific layer on top of the language model's output that is not supported by the transformers library at the moment. The CRF layer is usually fine-tuned together with the language model itself and improves accuracy by a few points of F1 score, but in fact you can use any T-NER model without the CRF via transformers, as below.

from transformers import AutoTokenizer, AutoModelForTokenClassification
tokenizer = AutoTokenizer.from_pretrained("tner/twitter-roberta-base-dec2021-tweetner7-all")
model = AutoModelForTokenClassification.from_pretrained("tner/twitter-roberta-base-dec2021-tweetner7-all")
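To get entity labels out of the raw model, you can decode the argmax over the token-classification logits yourself (a minimal sketch assuming the standard transformers API; the sample sentence is just an illustration):

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

model_name = "tner/twitter-roberta-base-dec2021-tweetner7-all"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)

inputs = tokenizer("Jacob Collier is an English artist from London", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, sequence_length, num_labels)

# Map each token's highest-scoring label id back to its tag name
predictions = logits.argmax(dim=-1)[0]
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, label_id in zip(tokens, predictions):
    print(token, model.config.id2label[label_id.item()])
```

Note this is a greedy per-token argmax decode, whereas the CRF would run a Viterbi decode over the whole sequence.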

Or simply with pipeline.

from transformers import pipeline
pipe = pipeline('token-classification', model='tner/twitter-roberta-base-dec2021-tweetner7-all')
pipe("Jacob Collier is an English artist from London")

That way you don’t even need to install tner; all you need is the transformers library. The drawback is that the model no longer uses the CRF, so performance could drop slightly. I haven’t run any experiments measuring the models' performance without the CRF, so I can’t say how much it would degrade. You would be better off testing it locally first with some test sentences from your use case.
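For a quick local sanity check of the CRF-free route, you could run the pipeline with sub-word aggregation over a few sentences from your domain (a sketch; the test sentences here are placeholders for your own data):

```python
from transformers import pipeline

pipe = pipeline(
    "token-classification",
    model="tner/twitter-roberta-base-dec2021-tweetner7-all",
    aggregation_strategy="simple",  # merge sub-word tokens into entity spans
)

# Placeholder sentences; substitute examples from your actual use case
test_sentences = [
    "Jacob Collier is an English artist from London",
    "Apple unveiled a new iPhone in Cupertino",
]
for sentence in test_sentences:
    for entity in pipe(sentence):
        print(sentence, "->", entity["entity_group"], entity["word"], round(entity["score"], 3))
```

If the spans look reasonable on your data, dropping the CRF may be an acceptable trade-off for a much simpler Docker build.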

Thank you so much @asahi417 for the response. I see why AllenNLP is used here.
It would be a nice exploration to see how the model performs without the CRF layer.

By the way, suppose someone does need the CRF layer. In that case, what do you recommend for building the Docker image? Do you have any guidance for the Dockerfile?

thank you :)