A Deep Learning NLP/NLU library by Intel® AI Lab
NLP Architect is an open source Python library for exploring state-of-the-art deep learning topologies and techniques for Natural Language Processing (NLP) and Natural Language Understanding (NLU). Its main purpose is to make NLP and NLU models easy to use while providing state-of-the-art, robust implementations.
NLP Architect is designed to be flexible and easy to extend, to allow easy and rapid integration of NLP models into applications, and to showcase optimized models.
Features:
- Core NLP models used in many NLP tasks and useful in many NLP applications
- NLU models showcasing novel topologies and techniques
- Simple REST API server:
  - serving trained models (for inference)
  - plug-in system for adding your own model
- 4 Demos of models (pre-trained by us) showcasing NLP Architect (Dependency parser, NER, Intent Extraction, Q&A)
- Based on optimized Deep Learning frameworks
- Documentation website and tutorials
- Essential utilities for working with NLP models - Text/String pre-processing, IO, data-manipulation, metrics, embeddings.
We recommend installing NLP Architect in a new Python environment, using Python 3.6+ with up-to-date pip, setuptools and h5py.
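The recommendation above can be checked before installing. The following is a minimal sketch using only the Python standard library; the function name `check_environment` is illustrative and not part of NLP Architect. It verifies the interpreter version and whether pip, setuptools and h5py can be found. It does not check whether those packages are up to date.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 6)  # minimum version recommended above
RECOMMENDED_PACKAGES = ("pip", "setuptools", "h5py")


def check_environment():
    """Report whether the interpreter and recommended packages are present.

    Hypothetical helper for illustration; not an NLP Architect API.
    """
    report = {"python_ok": sys.version_info[:2] >= MIN_PYTHON}
    for name in RECOMMENDED_PACKAGES:
        # find_spec returns None when a top-level package cannot be located
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for item, ok in check_environment().items():
        print("{}: {}".format(item, "ok" if ok else "missing or too old"))
```

Running the script inside the new environment prints one line per check, so a missing h5py or an old interpreter is visible before the install step.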
Installing from source includes the core library and all content (example scripts, datasets, tutorials).
Clone the repository:

    git clone https://github.com/NervanaSystems/nlp-architect.git
    cd nlp-architect

Install (in develop mode):

    pip install -e .

Installing the released package with pip includes only the core library:

    pip install nlp-architect

Refer to the full installation instructions page on our website for complete details on installing NLP Architect and optional backends such as MKL-DNN or GPU.
NLP models that provide best-in-class (or near best-in-class) performance:
- Word chunking
- Named Entity Recognition
- Dependency parsing
- Intent Extraction
- Sentiment classification
- Language models
Natural Language Understanding (NLU) models that address semantic understanding:
- Aspect Based Sentiment Analysis (ABSA)
- Noun phrase embedding representation (NP2Vec)
- Most common word sense detection
- Relation identification
- Cross document coreference
- Noun phrase semantic segmentation
Components instrumental for conversational AI:
End-to-end Deep Learning-based NLP models:
- Reading comprehension
- Sparse and Quantized Neural Machine Translation (GNMT)
- Language Modeling using Temporal Convolution Network (TCN)
- Unsupervised Cross-lingual embeddings
Solutions (End-to-end applications) using one or more models:
- Term Set expansion - uses the included word chunker as a noun phrase extractor and NP2Vec to create semantic term sets
- Topics and trend analysis - analyzing trending phrases in temporal corpora
- Aspect Based Sentiment Analysis (ABSA)
Full library documentation of NLP models, algorithms, solutions and instructions on how to run each model can be found on our website.
NLP Architect aspires to enable quick development of state-of-the-art NLP/NLU algorithms and to showcase Intel AI's efforts in deep-learning software optimization (TensorFlow MKL-DNN, etc.). The library is designed around the life cycle of model development: pre-process, build model, train, validate, infer, save or deploy.
The main design guidelines are:
- Deep Learning framework agnostic
- Develop topologies utilized in NLP models
- NLP/NLU model implementations using the included topologies
- Showcase End-to-End applications (Solutions) utilizing one or more NLP Architect models
- Generic dataset loaders, textual data processing utilities, and miscellaneous utilities that support NLP model development (loaders, text processors, io, metrics, etc.)
- Pythonic API for training and inference
- REST API servers with ability to serve trained models via HTTP
- Extensive model documentation and tutorials
Demo examples: Dependency parser, Intent Extraction.
Package | Description |
---|---|
nlp_architect.api | Model server API interfaces |
nlp_architect.common | Common packages |
nlp_architect.contrib | Framework extensions |
nlp_architect.data | Datasets, data loaders and data classes |
nlp_architect.models | NLP, NLU and End-to-End neural models |
nlp_architect.pipelines | End-to-end NLP apps |
nlp_architect.server | API Server and demos UI |
nlp_architect.solutions | Solution applications |
nlp_architect.utils | Misc. I/O, metric, pre-processing and text utilities |
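
After installation, one way to see which of the packages listed above are present in the current environment is to look them up with the standard library. This is a sketch under assumptions: the helper name `find_available_subpackages` is hypothetical, and the package names are copied from the table rather than discovered from the library. The pip-only install may not ship every package, so some entries can legitimately report `False`.

```python
import importlib.util

# Package names copied from the table above.
SUBPACKAGES = (
    "nlp_architect.api",
    "nlp_architect.common",
    "nlp_architect.contrib",
    "nlp_architect.data",
    "nlp_architect.models",
    "nlp_architect.pipelines",
    "nlp_architect.server",
    "nlp_architect.solutions",
    "nlp_architect.utils",
)


def find_available_subpackages(names=SUBPACKAGES):
    """Map each dotted package name to True if it can be located.

    Hypothetical helper for illustration; not an NLP Architect API.
    """
    available = {}
    for name in names:
        try:
            # For dotted names, find_spec imports the parent package first.
            available[name] = importlib.util.find_spec(name) is not None
        except ImportError:
            # Raised when the parent package is not installed or fails to import.
            available[name] = False
    return available


if __name__ == "__main__":
    for name, found in find_available_subpackages().items():
        print("{}: {}".format(name, "found" if found else "not found"))
```

The function only locates packages; it does not import the subpackages themselves, so it avoids loading heavy backend dependencies just to list what is installed.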
NLP Architect is an active space of research and development. Future releases will add and change models, solutions, topologies and framework support. We aim to make sure all models run with Python 3.6+. We encourage researchers and developers to contribute their work to the library.
If you use NLP Architect in your research, please use the following citation:
@misc{izsak_peter_2018_1477518,
author = {Izsak, Peter and
Bethke, Anna and
Korat, Daniel and
Yaccobi, Amit and
Mamou, Jonathan and
Guskin, Shira and
Nittur Sridhar, Sharath and
Keller, Andy and
Pereg, Oren and
Eirew, Alon and
Tsabari, Sapir and
Green, Yael and
Kothapalli, Chinnikrishna and
Eavani, Harini and
Wasserblat, Moshe and
Liu, Yinyin and
Boudoukh, Guy and
Zafrir, Ofir and
Tewani, Maneesh},
title = {NLP Architect by Intel AI Lab},
month = nov,
year = 2018,
doi = {10.5281/zenodo.1477518},
url = {https://doi.org/10.5281/zenodo.1477518}
}
The NLP Architect is released as reference code for research purposes. It is not an official Intel product, and the level of quality and support may not be as expected from an official product. NLP Architect is intended to be used locally and has not been designed, developed or evaluated for production usage or web-deployment. Additional algorithms and environments are planned to be added to the framework. Feedback and contributions from the open source and NLP research communities are more than welcome.
Contact the NLP Architect development team through GitHub issues or by email: nlp_architect@intel.com