makcedward/nlpaug

Unable to load a custom model

Closed this issue · 3 comments

Hi, I trained a model, but nlpaug is unable to load it. I think this could be solved if we could pass AutoModel and AutoTokenizer objects from transformers, so that new models can be handled.

aug = naw.ContextualWordEmbsAug(
----> 4 model_path=model_name, action="substitute")

3 frames
/usr/local/lib/python3.6/dist-packages/nlpaug/augmenter/word/context_word_embs.py in __init__(self, model_path, action, temperature, top_k, top_p, name, aug_min, aug_max, aug_p, stopwords, skip_unknown_word, device, force_reload, optimize, stopwords_regex, verbose)
91 self.model = self.get_model(
92 model_path=model_path, device=device, force_reload=force_reload, temperature=temperature, top_k=top_k,
---> 93 top_p=top_p, optimize=optimize)
94 # Override stopwords
95 if stopwords is not None and self.model_type in ['xlnet', 'roberta']:

/usr/local/lib/python3.6/dist-packages/nlpaug/augmenter/word/context_word_embs.py in get_model(cls, model_path, device, force_reload, temperature, top_k, top_p, optimize)
269 def get_model(cls, model_path, device='cuda', force_reload=False, temperature=1.0, top_k=None, top_p=0.0,
270 optimize=None):
--> 271 return init_context_word_embs_model(model_path, device, force_reload, temperature, top_k, top_p, optimize)

/usr/local/lib/python3.6/dist-packages/nlpaug/augmenter/word/context_word_embs.py in init_context_word_embs_model(model_path, device, force_reload, temperature, top_k, top_p, optimize)
28 model = nml.Roberta(model_path, device=device, temperature=temperature, top_k=top_k, top_p=top_p)
29 elif 'bert' in model_path:
---> 30 model = nml.Bert(model_path, device=device, temperature=temperature, top_k=top_k, top_p=top_p)
31 elif 'xlnet' in model_path:
32 model = nml.XlNet(model_path, device=device, temperature=temperature, top_k=top_k, top_p=top_p, optimize=optimize)

/usr/local/lib/python3.6/dist-packages/nlpaug/model/lang_models/bert.py in __init__(self, model_path, temperature, top_k, top_p, device)
21 self.model_path = model_path
22
---> 23 self.tokenizer = BertTokenizer.from_pretrained(model_path)
24 self.model = BertForMaskedLM.from_pretrained(model_path)
25

NameError: name 'BertTokenizer' is not defined
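For context on why a custom model path is fragile here: the `init_context_word_embs_model` frame in the traceback above picks a wrapper class by substring-matching `model_path`. The sketch below is a simplified illustration of that dispatch, not nlpaug's actual code; the function name `guess_model_type` and the example path `/content/my-finetuned-model` are hypothetical. A fine-tuned model saved under an arbitrary directory name matches none of the branches, which is why passing tokenizer/model objects directly, as suggested, would avoid the guess entirely.

```python
# Simplified sketch of substring-based model dispatch (not nlpaug's real code).
def guess_model_type(model_path: str) -> str:
    path = model_path.lower()
    # Order matters: "roberta" contains "bert", so it must be checked first.
    if 'roberta' in path:
        return 'roberta'
    elif 'bert' in path:
        return 'bert'
    elif 'xlnet' in path:
        return 'xlnet'
    return 'unknown'

print(guess_model_type('bert-base-uncased'))            # -> bert
print(guess_model_type('roberta-base'))                 # -> roberta
print(guess_model_type('/content/my-finetuned-model'))  # -> unknown
```

Any custom directory name that falls into the `unknown` case (or that accidentally contains one of the substrings) will be routed to the wrong loader or none at all.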

I tried transformers version 2.5.1. However, it includes a bug when tokenizing words. I will evaluate it further.

Ok, thanks Edward.

I tried to use AutoModel, but it seems that it changes the behavior completely. I need more time to investigate.