/serverless-multilabel-text-classification

Serverless text classification using the Magpie framework and AWS Lambda

Primary language: Python

Magpie serverless

This project attempts to deploy the Magpie framework as a serverless function. The function works locally but still needs some tweaks to work in the cloud. The model folder contains the pretrained model that the function needs. Pull requests are highly appreciated.

Status

The package is too heavy for AWS Lambda, even after reducing the size of the dependencies (removing shared libraries, etc.). I have not tested it on Kubeless. I am keeping the repository up for the record.
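To see how far a build is from fitting, the size of the unpacked package directory can be measured directly. The sketch below is a minimal illustration, not part of this repository: the function names are hypothetical, and the 250 MB figure is the unzipped deployment package limit as documented by AWS at the time of writing, so check the current quotas before relying on it.

```python
import os

# Assumption: AWS documents a 250 MB limit for the unzipped deployment
# package (including layers). Verify against the current Lambda quotas.
LAMBDA_UNZIPPED_LIMIT_BYTES = 250 * 1024 * 1024


def directory_size_bytes(root):
    """Sum the sizes of all regular files under root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def fits_in_lambda(root, limit=LAMBDA_UNZIPPED_LIMIT_BYTES):
    """Return (fits, size_in_bytes) for the package directory at root."""
    size = directory_size_bytes(root)
    return size <= limit, size
```

Calling `fits_in_lambda` on the directory that holds the vendored dependencies (the exact path depends on what build_vendored.sh produces) gives a quick yes/no answer plus the measured size.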

Prerequisites

  • Have Node.js and npm installed. There is a good guide for installing Node.js and npm on Linux here
  • Have Python and pip installed. There is a good guide for installing Python with conda here
  • Have virtualenv installed:
 pip install virtualenv
  • Make sure you have exported your AWS keys in your environment variables
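A quick way to confirm the last prerequisite is to check the environment from Python before deploying. This is a small sketch rather than a script shipped with the repository; the helper name `missing_aws_vars` is hypothetical, and it assumes the keys are exported under the standard names `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`.

```python
import os

# Standard environment variable names read by the AWS SDKs and CLI.
REQUIRED_AWS_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_vars(environ=None):
    """Return the required AWS variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_AWS_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_aws_vars()
    if missing:
        print("Missing AWS variables: " + ", ".join(missing))
    else:
        print("AWS credentials found in the environment.")
```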

Install / Update serverless framework

npm install -g serverless

Getting started

git clone https://github.com/Sach97/serverless-multilabel-text-classification.git
cd serverless-multilabel-text-classification
virtualenv venv --python=python3
source venv/bin/activate
pip3 install git+https://github.com/inspirehep/magpie.git@v2.0 && pip3 install tensorflow
serverless invoke local -f predict --data '{"text":"Stephen Hawking studies black holes"}' --log # uncomment the two lines in handler.py

Build the code dependency package yourself and train the model

git clone https://github.com/Sach97/serverless-multilabel-text-classification.git
cd serverless-multilabel-text-classification
chmod +x build_vendored.sh
chmod +x clean_venv.sh
chmod +x env_var.sh
virtualenv venv --python=python3
source venv/bin/activate
pip3 install git+https://github.com/inspirehep/magpie.git@v2.0 && pip3 install tensorflow
sh clean_venv.sh
sh build_vendored.sh
python train_model.py
python upload_model.py
serverless invoke local -f predict --data '{"text":"Stephen Hawking studies black holes"}' --log 

Run locally

serverless invoke local -f predict --data '{"text":"Stephen Hawking studies black holes"}' --log
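For orientation, here is a rough sketch of how a handler could process the `{"text": ...}` payload passed with `--data`. It is not the contents of handler.py. The `classify` argument and `fake_classify` are hypothetical placeholders for the Magpie model call, assumed here to return (label, confidence) pairs, and the response shape is only one possible choice.

```python
import json


def extract_text(event):
    """Pull the 'text' field out of an invocation event."""
    body = event
    # HTTP-triggered events usually wrap the payload as a JSON string in "body".
    if isinstance(event, dict) and "body" in event:
        body = event["body"]
    if isinstance(body, str):
        body = json.loads(body)
    if not isinstance(body, dict):
        raise ValueError("event payload must be a JSON object")
    text = body.get("text")
    if not isinstance(text, str) or not text.strip():
        raise ValueError("event must contain a non-empty 'text' field")
    return text


def fake_classify(text):
    """Placeholder scorer; a real handler would call the trained model here."""
    return [("placeholder-label", 0.0)]


def predict(event, context=None, classify=fake_classify):
    """Lambda-style entry point: validate input, classify, return JSON."""
    try:
        text = extract_text(event)
    except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
        return {"statusCode": 400, "body": json.dumps({"error": str(exc)})}
    labels = [
        {"label": label, "confidence": float(score)}
        for label, score in classify(text)
    ]
    return {"statusCode": 200, "body": json.dumps({"text": text, "labels": labels})}
```

Keeping the classifier as an argument means the input handling can be exercised locally without installing Magpie or TensorFlow.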

TODOs

  • Create a real function instead of just importing magpie
  • Make a better shell script for the zip
  • Resolve the issue
  • Add CircleCI continuous integration badge and an explanation guide.
  • Add an AWS deployment button
  • Load the model globally before the Lambda function gets called, like in this repo
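The last item relies on the fact that module-level state in a Lambda container persists between invocations while the container stays warm. A minimal sketch of the idea follows, using a lazy variant that loads on first use instead of at import time. `load_model` is a hypothetical stand-in for the real (slow) Magpie model load, and none of these names come from the repository.

```python
import time

# Module-level cache: initialised once per container, reused while it is warm.
_MODEL = None


def load_model():
    """Hypothetical stand-in for an expensive model load."""
    time.sleep(0.01)  # simulate slow I/O
    return {"loaded_at": time.time()}


def get_model(loader=load_model):
    """Return the cached model, loading it on first use only."""
    global _MODEL
    if _MODEL is None:
        _MODEL = loader()
    return _MODEL


def handler(event, context=None):
    """Every invocation in the same container shares one loaded model."""
    model = get_model()
    return {"model_loaded_at": model["loaded_at"]}
```

An eager version would assign `_MODEL = load_model()` at import time, which moves the cost into the cold start instead of the first request.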