cisco/mindmeld

Improve error messages for installing extra [bert] when using a QuestionAnswerer

murali1996 opened this issue · 3 comments

When a user tries to create a QuestionAnswerer without installing the extra [bert], the error raised is ValueError: Invalid model configuration: Unknown embedder type 'bert'. This message is misleading because the user's config may be correct while the bert requirements are simply not installed. The following snippet produces the error even though the inputs are correct, because the requirements are not satisfied:

from mindmeld import QuestionAnswerer

QUESTION_ANSWERER_CONFIG = {
    "model_type": "elasticsearch",
    "model_settings": {
        "query_type": "embedder_text",
        "embedder_type": "bert",
        "pretrained_name_or_abspath": "bert-base-cased",
        "bert_output_type": "mean",
        "quantize_model": False,
        "embedding_fields": {
            # ...
        }
    }
}

qa = QuestionAnswerer(app_path="./blueprints/hr_assistant", config=QUESTION_ANSWERER_CONFIG)

Possible solution(s):

  • One way: Create a dict that identifies which extras are required for each key in EMBEDDER_MAP in the helpers.py file, and check that they are installed in the create_embedder_model method.
  • Another way: Create an EmbedderModelFactory that runs these validation checks before selecting the class to use. For backwards compatibility, keep supporting the create_embedder_model method, but going forward use something like EmbedderModelFactory.create_model(config=...).
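
A rough sketch of the first option, to make the idea concrete. Everything here is hypothetical and not existing MindMeld code: the names EMBEDDER_REQUIREMENTS, find_missing_modules and check_embedder_requirements are made up for illustration, and the list of modules behind the [bert] extra is an assumption that would need to be checked against setup.py.

```python
import importlib.util

# Hypothetical mapping: embedder type -> (modules that must be importable, pip extra).
# The module names and extras below are assumptions for illustration only.
EMBEDDER_REQUIREMENTS = {
    "bert": (("torch", "transformers", "sentence_transformers"), "bert"),
    "glove": ((), None),
}


def find_missing_modules(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


def check_embedder_requirements(embedder_type, requirements=EMBEDDER_REQUIREMENTS):
    """Raise ValueError for an unknown type, ImportError for missing libraries."""
    if embedder_type not in requirements:
        # The config itself is wrong: keep reporting this as a config error.
        raise ValueError(
            f"Invalid model configuration: Unknown embedder type '{embedder_type}'"
        )
    modules, extra = requirements[embedder_type]
    missing = find_missing_modules(modules)
    if missing:
        # The config is fine, but the optional libraries are not installed.
        raise ImportError(
            f"Embedder type '{embedder_type}' requires missing libraries: "
            f"{', '.join(missing)}. Run 'pip install mindmeld[{extra}]' to install."
        )
```

Calling a check like this at the top of create_embedder_model would keep the ValueError for genuinely unknown embedder types and report a missing install separately.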

Could we do something analogous to this:

        GoogleTranslator._check_credential_exists()
        try:
            translate_v2 = importlib.import_module("google.cloud.translate_v2")
            return translate_v2.Client()
        except ModuleNotFoundError as error:
            raise ModuleNotFoundError(
                "Library not found: 'google-cloud'. Run 'pip install mindmeld[language_annotator]'"
                " to install."
            ) from error

Yeah. I'll add a solution like the one you suggested to the create_embedder_model method in components/helpers.py by catching the ImportError from the embedder_models.py classes. I will include this change in PR #341.
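
For reference, a minimal sketch of what handling the ImportError inside the creation step could look like. This is not the code from the PR: load_embedder_class and both of its mapping arguments are hypothetical names, and the real create_embedder_model signature may differ.

```python
import importlib


def load_embedder_class(embedder_type, embedder_sources, extras):
    """Import and return the class for embedder_type.

    embedder_sources maps type -> (module name, class name) and extras maps
    type -> pip extra name. Both mappings are hypothetical, for illustration.
    """
    if embedder_type not in embedder_sources:
        # Genuinely unknown type: this is a config problem.
        raise ValueError(
            f"Invalid model configuration: Unknown embedder type '{embedder_type}'"
        )
    module_name, class_name = embedder_sources[embedder_type]
    try:
        module = importlib.import_module(module_name)
    except ImportError as error:
        # Known type, but its optional dependencies failed to import.
        raise ImportError(
            f"Could not load embedder type '{embedder_type}'. "
            f"Run 'pip install mindmeld[{extras[embedder_type]}]' to install."
        ) from error
    return getattr(module, class_name)
```

Chaining with "from error" keeps the original traceback, so the user still sees which library was missing.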