Visual Chatbot

Introduction

Demo for the paper

Visual Dialog
Abhishek Das, Satwik Kottur, Khushi Gupta, Avi Singh, Deshraj Yadav, José M. F. Moura, Devi Parikh, Dhruv Batra
arxiv.org/abs/1611.08669
CVPR 2017 (Spotlight)

Live demo: http://visualchatbot.cloudcv.org

Visual Dialog requires an AI agent to hold a meaningful dialog with humans in natural, conversational language about visual content. Given an image, dialog history, and a follow-up question about the image, the AI agent has to answer the question. Putting it all together, we demonstrate the first ‘visual chatbot’!

[Screenshot: Visual Chatbot demo]

Installation Instructions

Install the essential requirements

sudo apt-get install -y git python-pip python-dev
sudo apt-get install -y autoconf automake libtool curl make g++ unzip
sudo apt-get install -y libgflags-dev libgoogle-glog-dev liblmdb-dev
sudo apt-get install -y libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler
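
As a quick optional check (not part of the original instructions), the compiler and protobuf toolchain installed above should both report versions:

# Both commands should print a version string if the packages installed correctly
g++ --version
protoc --version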

Install Torch

git clone https://github.com/torch/distro.git ~/torch --recursive
cd ~/torch; bash install-deps;
./install.sh
source ~/.bashrc
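
To confirm Torch installed correctly, an optional sanity check (assumes th is on your PATH after sourcing ~/.bashrc):

# Should print a 2x2 zero tensor without errors
th -e "print(torch.Tensor(2, 2):zero())"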

Install PyTorch (Python wrapper for Lua Torch)

Note: this is the hughperkins/pytorch bridge between Python and Lua Torch, not the PyTorch deep learning framework.

git clone https://github.com/hughperkins/pytorch.git
cd pytorch
source ~/torch/install/bin/torch-activate
./build.sh
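
If the build succeeded, the wrapper should import from Python. This quick check is an assumption based on the upstream hughperkins/pytorch project, which exposes a PyTorch module; it is not a step from the original instructions:

# Should print the confirmation line if the Lua <-> Python bridge built correctly
python -c "import PyTorch; print('PyTorch wrapper OK')"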

Install RabbitMQ and Redis Server

sudo apt-get install -y redis-server rabbitmq-server
sudo rabbitmq-plugins enable rabbitmq_management
sudo service rabbitmq-server restart 
sudo service redis-server restart
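
A quick optional check that both services are running:

# Redis should answer PONG; rabbitmqctl prints a status report if the broker is up
redis-cli ping
sudo rabbitmqctl status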

Lua dependencies

luarocks install loadcaffe
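
To verify that loadcaffe is visible to Torch (optional):

# Should print the confirmation line if the rock installed into the Torch tree
th -e "require 'loadcaffe'; print('loadcaffe OK')"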

The two dependencies below are only required if you plan to use a GPU.

luarocks install cudnn
luarocks install cunn

CUDA Installation

Note: CUDA and cuDNN are only required if you plan to use a GPU.

Download and install CUDA and cuDNN from the NVIDIA website.
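
After installing, you can check that the driver and toolkit are visible (optional):

# nvidia-smi lists visible GPUs and the driver version; nvcc reports the toolkit version
nvidia-smi
nvcc --version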

Install dependencies

git clone https://github.com/Cloud-CV/visual-chatbot.git
cd visual-chatbot
git submodule init && git submodule update
sh models/download_models.sh
pip install -r requirements.txt
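
Since the demo is served through manage.py, Django is among the Python requirements; a quick optional way to confirm the pip install worked:

# Should print the installed Django version
python -c "import django; print(django.get_version())"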

Change lines 2-4 of neuraltalk2/misc/LanguageModel.lua to the following:

local utils = require 'neuraltalk2.misc.utils'
local net_utils = require 'neuraltalk2.misc.net_utils'
local LSTM = require 'neuraltalk2.misc.LSTM'
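
If you prefer not to edit the file by hand, a sed one-liner can apply the same change, assuming lines 2-4 of the file still contain the stock misc. require paths from upstream neuraltalk2:

# Rewrites the require paths on lines 2-4 in place; back up the file first if unsure
sed -i "2,4s/'misc\./'neuraltalk2.misc./" neuraltalk2/misc/LanguageModel.lua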

Running the RabbitMQ workers and the development server

Open three separate terminal sessions and run one of the following commands in each (the two workers and the development server each need their own session):

python worker.py
python worker_captioning.py
python manage.py runserver

You are all set. Visit http://127.0.0.1:8000 to see the demo running.
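
To confirm from the command line that the development server is responding (optional):

# Expect an HTTP response header block from the local Django development server
curl -I http://127.0.0.1:8000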

Cite this work

If you find this code useful, consider citing our work:

@inproceedings{visdial,
  title={{V}isual {D}ialog},
  author={Abhishek Das and Satwik Kottur and Khushi Gupta and Avi Singh
    and Deshraj Yadav and Jos\'e M.F. Moura and Devi Parikh and Dhruv Batra},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2017}
}

Contributors

License

BSD