Pic2Speech

A Neural Captioning Model trained with Keras and deployed in a Flutter App

Pic2Speech - Describing the world for the visually impaired

Pic2Speech uses artificial intelligence to describe the content of pictures, helping visually impaired people understand the world around them.

Model

The model is built with Keras and is largely based on "Show and Tell: A Neural Image Caption Generator" by Vinyals et al. It is trained on the full MS COCO dataset for around 500k steps.
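
For orientation, here is a minimal sketch of such an encoder-decoder captioning model in Keras: a CNN feature vector is projected into the word-embedding space and prepended to the caption tokens, as in Vinyals et al. The vocabulary, embedding, and feature sizes below are illustrative assumptions, not the values used in this repository.

```python
from tensorflow.keras import layers, Model

VOCAB_SIZE = 10000   # assumed vocabulary size
EMBED_DIM = 512      # assumed embedding / LSTM size
FEATURE_DIM = 2048   # e.g. pooled InceptionV3 features

# Image branch: a pre-extracted CNN feature vector is projected into the
# embedding space and used as the first "token" fed to the LSTM.
image_input = layers.Input(shape=(FEATURE_DIM,), name="image_features")
image_embedding = layers.Dense(EMBED_DIM, activation="relu")(image_input)
image_embedding = layers.Reshape((1, EMBED_DIM))(image_embedding)

# Text branch: the partial caption so far, as integer token ids.
caption_input = layers.Input(shape=(None,), dtype="int32", name="caption_tokens")
word_embeddings = layers.Embedding(VOCAB_SIZE, EMBED_DIM)(caption_input)

# Prepend the image embedding to the word sequence and predict, at each
# step, a distribution over the next word (targets are the caption
# shifted by one position; padding handling is omitted for brevity).
sequence = layers.Concatenate(axis=1)([image_embedding, word_embeddings])
lstm_out = layers.LSTM(EMBED_DIM, return_sequences=True)(sequence)
next_word = layers.Dense(VOCAB_SIZE, activation="softmax")(lstm_out)

model = Model([image_input, caption_input], next_word)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```

At inference time the decoder is run autoregressively, feeding each predicted word back in until an end-of-sentence token is produced.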

Deployment

The model is deployed as a web service on Azure Machine Learning using the Azure ML Python SDK.
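
A minimal deployment sketch with the azureml-core (v1) Python SDK is shown below; the model path, entry script, environment file, and service names are assumptions for illustration, not necessarily what this project uses.

```python
from azureml.core import Environment, Model, Workspace
from azureml.core.model import InferenceConfig
from azureml.core.webservice import AciWebservice

ws = Workspace.from_config()  # reads a local config.json for the workspace

# Register the trained Keras model with the workspace.
model = Model.register(
    workspace=ws,
    model_path="outputs/captioning_model.h5",  # assumed artifact path
    model_name="pic2speech-captioner",
)

# score.py (assumed) must define init() and run(raw_data) for inference.
env = Environment.from_conda_specification("pic2speech-env", "environment.yml")
inference_config = InferenceConfig(entry_script="score.py", environment=env)

# Deploy to Azure Container Instances to get a simple REST endpoint.
deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=2)
service = Model.deploy(ws, "pic2speech-service", [model],
                       inference_config, deployment_config)
service.wait_for_deployment(show_output=True)
print(service.scoring_uri)
```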

Mobile App

The mobile app is developed with Google Flutter and lets people take a picture with their smartphone and hear a spoken description of it.
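
For clarity, here is the round trip the app performs, sketched in Python rather than Dart: POST the captured image to the scoring endpoint, read back the caption, and hand it to text-to-speech. The endpoint URL and the JSON field names are assumptions.

```python
import base64
import json

import requests

SCORING_URI = "https://<your-service>.azurecontainer.io/score"  # assumed

def caption_image(image_path: str) -> str:
    """Send an image to the captioning service and return its caption."""
    with open(image_path, "rb") as f:
        payload = {"image": base64.b64encode(f.read()).decode("utf-8")}
    response = requests.post(
        SCORING_URI,
        data=json.dumps(payload),
        headers={"Content-Type": "application/json"},
    )
    response.raise_for_status()
    return response.json()["caption"]  # assumed response field

if __name__ == "__main__":
    print(caption_image("photo.jpg"))  # the app then speaks this text aloud
```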