Named Entity Recognition (NER) is a core task in natural language processing (NLP): identifying and classifying entities in text, such as the names of people, organizations, and locations, as well as dates. This README gives an overview of using TensorFlow, a popular open-source machine learning framework, to develop NER models. NER automates the extraction of structured information from unstructured text, which makes it useful across many NLP applications. This guide covers how to use TensorFlow to build NER models and adapt them to specific domain requirements.
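To make the task concrete, here is a minimal sketch of what "identifying and classifying entities" produces. It assumes the common BIO tagging scheme (`B-` starts an entity, `I-` continues it, `O` is outside any entity); this README does not state which scheme the project uses, and the function name `extract_entities` and the sample sentence are hypothetical.

```python
def extract_entities(tokens, tags):
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs."""
    entities = []
    current_tokens, current_type = [], None

    def flush():
        # Store the entity collected so far, if there is one.
        if current_tokens:
            entities.append((" ".join(current_tokens), current_type))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new entity.
            flush()
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # An I- tag of the same type extends the open entity.
            current_tokens.append(token)
        else:
            # "O" or an inconsistent I- tag closes any open entity
            # and is itself skipped.
            flush()
            current_tokens, current_type = [], None

    flush()
    return entities


# Hypothetical sentence with hand-written tags (not model output).
tokens = ["Ada", "Lovelace", "worked", "in", "London", "."]
tags = ["B-PER", "I-PER", "O", "O", "B-LOC", "O"]
print(extract_entities(tokens, tags))
```

A sequence model such as a BiLSTM predicts one tag per token; a helper like this turns those per-token tags into the structured entities described above.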
The project directory is organized as follows:
- `Makefile`: Contains Makefile commands for project setup and data retrieval.
- `models/`: Directory where pre-trained models are stored.
  - `BiLSTM.h5`: A pre-trained BiLSTM model.
- `notebooks/`: Jupyter notebooks for exploring and using the project.
  - `01_BiLSTM_model.ipynb`: A Jupyter notebook with a BiLSTM model.
- `poetry.lock`: Lock file generated by Poetry for package management.
- `pyproject.toml`: Poetry configuration file for managing project dependencies.
- `README.md`: This file, providing an overview of the project.
- `src/`: Source code directory containing project code.
  - `crflayer.py`: Code for the CRF layer.
  - `get_data.sh`: Script for getting data (note that there is a typo in the Makefile command).
  - `__init__.py`: Python package initializer.
  - `nermodel.py`: Code related to NER model implementation.
Create the environment with Poetry and set up the project:
make init
Download the data and save it inside a data folder:
make get_data
This project can be used to study NER (Named Entity Recognition) using TensorFlow.
The BiLSTM model is saved inside a models folder for future deployment or inference.
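As a rough sketch of how the saved model might be reused for inference, the snippet below loads `models/BiLSTM.h5` with the standard Keras loader and converts per-token class probabilities into tag names. The helper names, the `index_to_tag` mapping, and the sample probabilities are hypothetical; the real label set and input preprocessing depend on how the notebook trained the model. If the saved model contains the custom CRF layer from `src/crflayer.py`, loading would likely also need a `custom_objects` argument, which is not shown here.

```python
from pathlib import Path

# Path taken from the directory listing in this README.
MODEL_PATH = Path("models") / "BiLSTM.h5"


def load_bilstm(path=MODEL_PATH):
    """Load the saved Keras model (assumes no custom layers are required)."""
    # Imported lazily so the decoding helper below works without TensorFlow.
    import tensorflow as tf

    # compile=False skips restoring the optimizer and loss,
    # which are not needed for inference.
    return tf.keras.models.load_model(str(path), compile=False)


def decode_predictions(probs, index_to_tag):
    """Map one sentence's per-token class probabilities to tag names."""
    tags = []
    for row in probs:
        # Pick the index of the highest-probability class for this token.
        best = max(range(len(row)), key=lambda i: row[i])
        tags.append(index_to_tag[best])
    return tags


# Hypothetical label mapping and probabilities, for illustration only.
index_to_tag = {0: "O", 1: "B-PER", 2: "I-PER"}
sample_probs = [
    [0.1, 0.8, 0.1],
    [0.2, 0.1, 0.7],
    [0.9, 0.05, 0.05],
]
print(decode_predictions(sample_probs, index_to_tag))
```

In practice, `sample_probs` would be replaced by one sentence's slice of `model.predict(...)` output, after encoding the input tokens the same way as during training.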