Your application can understand natural language in-house.
Use open-source AI models that can be trained from the browser using JavaScript or Python, and that can run everywhere.
- Universal application: The trained models should be able to run anywhere, which is why they have two mirror implementations: one in TensorFlow.js, to train and run from browsers or Node.js, and one in Keras, to run in Python and export to mobile devices (CoreML for iOS and TensorFlow for Android).
- Offline support: Aida can train and make predictions without connectivity; there is no need for a server-side API, although the trained models can also run server-side behind an API if desired.
- Low memory consumption: Small file size and low memory consumption are very important for running from browsers. Most NLU models use huge dictionaries (several gigabytes in size) such as word2vec; to avoid this, Aida only uses pre-trained fastText bigram embeddings, which keeps the dictionary very small and fast to download (see the sketch after this list).
- Accurate: Carefully crafted, close to state-of-the-art neural network models for text classification and named entity recognition. The models will only get better as the field progresses and the community expands.
- Easy to use: Getting started by creating a dataset and training couldn't be easier thanks to Chatito: you can create a large dataset in minutes and start training without any setup, straight from the browser.
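To make the low-memory point concrete, here is a minimal TypeScript sketch of one common way a subword (character-bigram) dictionary replaces a multi-gigabyte word-level table: token vectors are built by averaging the vectors of their bigrams. The `bigramVectors` table and vector size are toy placeholders, and the exact lookup Aida uses may differ; this is only an illustration of the idea.

```ts
// Illustrative only: a tiny subword dictionary instead of a per-word table.
// The vectors below are placeholders, not Aida's real embeddings file.
type Vector = number[];

const EMBEDDING_SIZE = 4; // real embeddings are much wider (e.g. 300)

// Hypothetical pretrained character-bigram vectors (normally loaded from a small file).
const bigramVectors: Record<string, Vector> = {
  he: [0.1, 0.2, 0.0, 0.3],
  el: [0.0, 0.1, 0.4, 0.2],
  ll: [0.3, 0.0, 0.1, 0.1],
  lo: [0.2, 0.3, 0.0, 0.0],
};

// Build a token vector by averaging its bigram vectors, so unseen words
// still get a representation without needing a per-word dictionary entry.
function embedToken(token: string): Vector {
  const result: Vector = new Array(EMBEDDING_SIZE).fill(0);
  let found = 0;
  for (let i = 0; i < token.length - 1; i++) {
    const vec = bigramVectors[token.slice(i, i + 2).toLowerCase()];
    if (!vec) continue;
    vec.forEach((v, d) => (result[d] += v));
    found++;
  }
  return found ? result.map(v => v / found) : result;
}

console.log(embedToken('hello')); // averaged vector built from 4 bigrams
```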
It's a chatbot running from the browser using TensorFlow.js, with the Web Speech API for voice interaction too.
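As a rough sketch of how the browser pieces fit together, the snippet below wires the Web Speech API to a text handler. The `classifyIntent` stub is a hypothetical stand-in for the trained Aida model, and the snippet assumes a Chromium-based browser where the recognizer is exposed as `webkitSpeechRecognition`.

```ts
// Minimal sketch: capture a spoken sentence, hand it to the NLU model,
// and speak the reply back.

// Hypothetical stand-in for the trained Aida model's prediction call.
async function classifyIntent(text: string): Promise<{ intent: string }> {
  return { intent: 'greeting' }; // replace with a real model prediction
}

// Chromium exposes the recognizer behind a webkit prefix.
const SpeechRecognitionCtor =
  (window as any).SpeechRecognition || (window as any).webkitSpeechRecognition;
const recognition = new SpeechRecognitionCtor();
recognition.lang = 'en-US';

recognition.onresult = async (event: any) => {
  const transcript: string = event.results[0][0].transcript;
  const { intent } = await classifyIntent(transcript);
  // Reply with the browser's built-in speech synthesis.
  window.speechSynthesis.speak(
    new SpeechSynthesisUtterance(`I understood the intent: ${intent}`)
  );
};

recognition.start(); // start listening for a single utterance
```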
You can train from the browser using JavaScript and TensorFlow.js (using your local GPU resources), or from the browser using Python and Keras thanks to Google Colab's free TPUs. There is no need to set up a local environment to start training your own conversational assistant.
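The JavaScript training loop itself is ordinary TensorFlow.js code, so the same `compile()`/`fit()` calls run in a browser tab (WebGL backend, using the local GPU) or in Node.js. The tiny model and random tensors below are placeholders to show the shape of the API, not Aida's actual architecture.

```ts
import * as tf from '@tensorflow/tfjs';

// Placeholder model and data: the point is that training runs the same way
// in the browser (WebGL/GPU backend) as in Node.js.
const model = tf.sequential();
model.add(tf.layers.dense({ inputShape: [300], units: 64, activation: 'relu' }));
model.add(tf.layers.dense({ units: 5, activation: 'softmax' })); // 5 example intents

model.compile({ optimizer: 'adam', loss: 'categoricalCrossentropy', metrics: ['accuracy'] });

const xs = tf.randomNormal([128, 300]); // fake sentence embeddings
const ys = tf.oneHot(tf.randomUniform([128], 0, 5, 'int32'), 5).toFloat(); // fake intent labels

model.fit(xs, ys, { epochs: 3, batchSize: 32 }).then(history => {
  console.log('final loss:', history.history.loss);
});
```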
Alternatively to the online training experience, you can also set up a local environment:

1. Clone the GitHub project and install dependencies:
    - Run `npm install` from the `./typescript` directory
    - Run `pip3 install -r requirements.txt` from the `./python` directory
2. Edit or create the chatito files inside `./typescript/examples/en/intents` to customize the dataset generation as you need.
3. From `./typescript`, run `npm run dataset:en:process`. This will generate many files in the `./typescript/public/models` directory: the dataset, the dataset parameters, the testing parameters and the embeddings dictionary. You can further inspect those generated files to make sense of their content (see the sketch after these steps). (Note: Aida also supports the Spanish language; if you need another language, you can add it once you download the fastText embeddings for that language.)
4. You can start training from 3 local environments:
    - From Python: just open `./python/main.ipynb` with Jupyter Notebook or JupyterLab. Python will load your custom settings generated at step 3. After running the notebook, convert the models to the TensorFlow.js web format by running `npm run python:convert` from the `./typescript` directory.
    - From web browsers: from `./typescript`, run `npm run web:start`, then navigate to `http://localhost:8000/train` for the training web UI. After training, download the model to the `./typescript/public/pretrained/web` directory.
    - From Node.js: from `./typescript`, run `npm run node:start`.
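For step 3, a quick way to inspect the generated artifacts is a small Node script like the one below, run from the `./typescript` directory. It only assumes the generated files are JSON; adjust the filter if some are not.

```ts
import * as fs from 'fs';
import * as path from 'path';

// Pretty-print the top-level structure of every JSON artifact that
// `npm run dataset:en:process` left in ./typescript/public/models
// (dataset, dataset parameters, testing parameters, embeddings dictionary).
// Run this from the ./typescript directory.
const modelsDir = path.resolve('./public/models');

for (const file of fs.readdirSync(modelsDir)) {
  if (!file.endsWith('.json')) continue; // assumes JSON output; skip anything else
  const content = JSON.parse(fs.readFileSync(path.join(modelsDir, file), 'utf8'));
  // Print only the top-level keys so large files stay readable.
  console.log(file, '->', Object.keys(content));
}
```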
NOTE: After training (and, for Python, converting), the models should be available at `./typescript/public/pretrained`, with a custom directory for each platform.
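Once the pretrained files are in place, loading them back is plain TensorFlow.js. The sketch below assumes a TensorFlow.js version that provides `tf.loadLayersModel` (1.x and later), that the web export follows the standard layout with a `model.json` manifest, and that the `public` directory is being served; treat the exact path as a placeholder.

```ts
import * as tf from '@tensorflow/tfjs';

// Load a converted model back into the browser. The path assumes the app
// serves ./typescript/public and that the web export contains a model.json
// manifest plus weight shards (the usual TensorFlow.js layout).
async function loadPretrained() {
  const model = await tf.loadLayersModel('/pretrained/web/model.json');
  model.summary(); // quick sanity check that the architecture loaded
  return model;
}

loadPretrained().catch(err => console.error('failed to load model', err));
```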
- Read the technical overview documentation.
- Universal Language Model Fine-tuning for Text Classification (blog post): ULMFiT is the current state of the art for NLP tasks.
Designing and maintaining Aida takes time and effort. If it was useful for you, please consider making a donation and share the abundance. Become a Patron!
Rodrigo Pimentel
The code is open-sourced under the BSD-3-Clause license; please contact me if you want to use the code under a less restrictive license.
Copyright 2018 Rodrigo Pimentel
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
- Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
- Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.