caption-generation
There are 97 repositories under the caption-generation topic.
aimagelab/meshed-memory-transformer
Meshed-Memory Transformer for Image Captioning. CVPR 2020
dabasajay/Image-Caption-Generator
A neural network that generates captions for an image using a CNN and an RNN with beam search.
aimagelab/show-control-and-tell
Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions. CVPR 2019
daveredrum/Scan2Cap
[CVPR 2021] Scan2Cap: Context-aware Dense Captioning in RGB-D Scans
OpenShapeLab/ShapeGPT
ShapeGPT: 3D Shape Generation with A Unified Multi-modal Language Model, a unified and user-friendly shape-language model
chenxinpeng/ARNet
CVPR 2018 - Regularizing RNNs for Caption Generation by Reconstructing The Past with The Present
ch3cook-fdu/Vote2Cap-DETR
[CVPR 2023] Vote2Cap-DETR and [T-PAMI 2024] Vote2Cap-DETR++; A set-to-set perspective towards 3D Dense Captioning; State-of-the-Art 3D Dense Captioning methods
daveredrum/D3Net
[ECCV2022] D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding
tanishqgautam/Image-Captioning
Implements three architectures for the image captioning problem: Merged Encoder-Decoder, Bahdanau Attention, and Transformers.
damminhtien/deep-learning-image-caption-generator
Deep CNN-LSTM for Generating Image Descriptions 😈
aimagelab/speaksee
PyTorch library for Visual-Semantic tasks
nalbert9/Image-Captioning
Computer Vision: Generate captions that describe the contents of images using PyTorch
rahulsonone1234/Traffic-Sign-Recognition
Helps drivers identify traffic signs and supports the efficient operation of self-driving cars.
heng-hw/SpaCap3D
[IJCAI 2022] Spatiality-guided Transformer for 3D Dense Captioning on Point Clouds (official PyTorch implementation)
nirajankarki5/Flickr30k-Image-Caption-Generator-Using-Deep-Learning
A deep learning model that generates descriptions of an image.
aimagelab/DiCO
[BMVC 2024 Oral ✨] Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization
abachaa/3D-MIR
3D Medical Image Retrieval in Radiology
apivideo/caption.new
Sample app to add captions to an uploaded video. From api.video (https://api.video)
ghostofpokemon/oCaption
oCaption: Leveraging OpenAI's GPT-4 Vision for Advanced Image Captioning
pier-maker92/stable-diffusion-experiments
A repo providing some Stable Diffusion experiments on the textual inversion and captioning tasks.
pritishmishra703/Image-Captioning
Image-to-Text
juletx/image-caption-generation
Automatic image caption generation model that uses a CNN to condition an LSTM-based language model
ApoorvGit/god-s-eye
An aid for blind people. This AI describes the surroundings, tells the user who is in front of them (if the person is known to the AI through facial recognition), and helps them read written text (optical character recognition).
ihaeyong/drama-graph
The Drama-Graph repository produces both a knowledge base from drama scripts and a video graph for the Video Turing Test (VTT).
abdullahzia510/Effecient-Urdu-Caption-Generation-using-Attention-Mechanism
This repository contains code and results for the course project of the Deep Learning Spring 2020 course offered at Information Technology University, Lahore, Pakistan. It is for learning purposes only and is not intended for commercial use.
lachhabw/Image-Captioning-Extension-for-LM-Studio
LM Studio extension for automatic image captioning.
mahendranandi/Image_Captioning
Image captioning using ResNet50 and LSTM with the Keras library. An application of both CV (Computer Vision) and NLP (Natural Language Processing) concepts.
LaurentVeyssier/Image-Captioning-Project-with-full-Encoder-Decoder-model
Generates captions for images using a CNN encoder and LSTM decoder structure
oshtz/tagmeister-pc
Efficient image captioning using OpenAI API
imanom/Generating-Subtitles
Generates subtitles from a video/audio file. Developed in Python and uses Google Cloud APIs.
sabirdvd/BLIP_image_caption_demo
BLIP image caption demo - Medium blog post
yash-sarwaswa/Image-Caption-Generator
A Python application that generates a caption for a selected image. Uses deep learning and NLP frameworks (TensorFlow, Keras, and NLTK) for data processing and for building and evaluating deep learning models.
Imiloin/Capoom
A real-time subtitle generator based on Whisper.
leeyunjai/image2text
caption generator using lavis and argostranslate
shunk031/huggingface-datasets_MSCOCO
Microsoft COCO: Common Objects in Context for huggingface datasets
Vinventive/live-captions-vr
Accessibility-focused SteamVR overlay that improves communication between deaf, hard-of-hearing, and hearing users in VR. It uses AI to let users see real-time speech transcription in their 3D space. DISCLAIMER: Voice recognition technology is prone to errors, and the project should not be used as a replacement for a medical hearing aid.