/Image-Captioining

The objective is to process by generating textual description from an image – based on the objects and actions in the image. Using generative models so that it creates novel sentences. Pipeline type models uses two separate learning process, one for language modelling and other for image recognition. It first identifies objects in image and provides the result to the Inception-v3 model to convert into word embedding vector than into series of LSTM cells to get desired captions.

Primary LanguagePythonMIT LicenseMIT

Stargazers