mini-omni

Pinned Repositories

mini-omni2
Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.
Language: Python · Stars: 1 · Forks: 0
CLIP
CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image
Language: Jupyter Notebook · Stars: 0 · Forks: 0
mini-omni
An open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.
Language: Python · Stars: 3.2k · Forks: 278
mini-omni2
Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.
Language: Python · Stars: 1.7k · Forks: 192

mini-omni/CLIP
CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image
Language: Jupyter Notebook
mini-omni/mini-omni2
Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.