/ner_fastapi

Named Entity Recognition App on FastAPI

Primary Language: Python

NER using spaCy and FastAPI

Basic Named Entity Extractor using spaCy and FastAPI

Description

This project explores how to deploy a basic ML model to production using Docker, AWS ECR, and EC2, and how to use the FastAPI library. The named entity extractor uses a pre-trained spaCy English language model, supplemented with regex patterns and an entity dictionary.

Getting Started

A config file manages the custom entity lists and regex patterns. Point the app at it by setting the CONFIG_FILE_PATH environment variable:

export CONFIG_FILE_PATH="/path/to/your/config.json"
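
A minimal sketch of how the app might read this variable at startup, assuming the config is plain JSON. The `load_config` helper and its empty-dict fallback are illustrative and not taken from the project.

```python
import json
import os
from pathlib import Path


def load_config(env_var: str = "CONFIG_FILE_PATH") -> dict:
    """Load the JSON config named by an environment variable.

    Hypothetical helper: returns an empty dict when the variable is unset,
    so the caller can fall back to the stock spaCy model.
    """
    path = os.environ.get(env_var)
    if not path:
        return {}
    with Path(path).open(encoding="utf-8") as fh:
        return json.load(fh)
```

Returning an empty dict rather than raising keeps the app usable without any custom entities; raising an error instead would be an equally reasonable choice.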

You can also restrict which entity labels the model extracts by defining a list of allowed labels:

{
  "entity_dicts": [
    {"label": "ORG", "pattern": [{"LOWER": "hamas"}]},
    {"label": "ORG", "pattern": [{"LOWER": "hizballah"}]},
    {"label": "ORG", "pattern": [{"LOWER": "isis"}]}
  ],
  "allowed_labels": [
    "GPE",
    "FAC",
    "PERSON",
    "ORG",
    "PRODUCT",
    "LOC",
    "EVENT",
    "LAW"
  ]
}
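
As a sketch of how the two keys might be consumed: the `entity_dicts` entries follow the token-pattern format used by spaCy's EntityRuler, and `allowed_labels` can act as a filter on whatever the pipeline extracts. The example below shows only the filtering step, on plain `(text, label)` pairs, so that it runs without spaCy installed. `filter_entities` is a hypothetical helper, not the project's code.

```python
def filter_entities(entities, allowed_labels):
    """Keep only (text, label) pairs whose label is in allowed_labels."""
    allowed = set(allowed_labels)
    return [(text, label) for text, label in entities if label in allowed]


config = {"allowed_labels": ["GPE", "ORG", "PERSON"]}

# Made-up pairs, shaped like (ent.text, ent.label_) taken from a spaCy Doc.
extracted = [("Paris", "GPE"), ("Tuesday", "DATE"), ("Hamas", "ORG")]

print(filter_entities(extracted, config["allowed_labels"]))
# prints [('Paris', 'GPE'), ('Hamas', 'ORG')]
```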

Routes

Method  URL
POST    /entities
GET     /form
POST    /form
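
The request bodies are not documented here, so the following is only a sketch of calling POST /entities from Python. It assumes a JSON body with a `text` field and a server at `http://localhost:8000`; both are guesses for illustration, and `build_entities_request` is a hypothetical helper.

```python
import json
import urllib.request


def build_entities_request(text, base_url="http://localhost:8000"):
    """Build (but do not send) a POST /entities request.

    Assumptions: the {"text": ...} body shape and the base URL are
    illustrative; check the app's request model for the real schema.
    """
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/entities",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_entities_request("Angela Merkel visited Paris.")

# To send it against a running server:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```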

Testing

I tested the English small, medium, and large models in a separate notebook, using manually tagged entities from different texts as ground truth. The model comparison suggests that the medium language model performed best overall in terms of precision, recall, and F1 score.
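
For reference, entity-level precision, recall, and F1 can be computed from the hand-tagged and predicted entity sets roughly as follows. This is a sketch that uses exact matching on `(text, label)` pairs; the notebook's actual matching rules are not described here, and the sample data is made up.

```python
def prf1(gold, predicted):
    """Return (precision, recall, f1) for two collections of entities."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up example: hand-tagged truth vs. one model's output.
gold = {("Paris", "GPE"), ("Hamas", "ORG"), ("Angela Merkel", "PERSON")}
pred = {("Paris", "GPE"), ("Hamas", "ORG"), ("Merkel", "PERSON"), ("Tuesday", "EVENT")}

p, r, f = prf1(gold, pred)
print(f"{p:.2f} {r:.2f} {f:.2f}")
# prints 0.50 0.67 0.57
```

Exact matching is strict: a partial span such as "Merkel" for "Angela Merkel" counts as both a false positive and a false negative.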

Further Reading

Helpful links on integrating spaCy and FastAPI, Dockerizing your app, pushing Docker images to AWS ECR, and pulling the image to deploy on EC2.