Pinned Repositories
dactory
delayed-streams-modeling
Kyutai's Speech-To-Text and Text-To-Speech models based on the Delayed Streams Modeling framework.
hibiki
Hibiki is a model for streaming speech translation (also known as simultaneous translation). Unlike offline translation, where one waits for the end of the source utterance before starting to translate, Hibiki adapts its flow to accumulate just enough context to produce a correct translation in real time, chunk by chunk.
moshi
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.
moshi-finetune
moshi-swift
moshivis
Kyutai with an "eye"
nanoGPTaudio
Code for the blog "Neural audio codecs: how to get audio into LLMs"
sphn
Python bindings for symphonia/opus: read various audio formats from Python and write Opus files.
unmute
Make text LLMs listen and speak
kyutai's Repositories
kyutai-labs/moshi
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.
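A minimal sketch of driving Mimi from Python through the moshi package, assuming the `loaders` helpers and checkpoint names documented in the repository README; treat the repo/file constants and the 8-codebook setting as assumptions rather than a definitive recipe:

```python
# Sketch only: assumes the moshi package's documented loaders API
# (loaders.DEFAULT_REPO, loaders.MIMI_NAME, loaders.get_mimi).
import torch
from huggingface_hub import hf_hub_download
from moshi.models import loaders

mimi_weight = hf_hub_download(loaders.DEFAULT_REPO, loaders.MIMI_NAME)
mimi = loaders.get_mimi(mimi_weight, device="cpu")
mimi.set_num_codebooks(8)  # Moshi itself consumes the first 8 codebooks

wav = torch.randn(1, 1, 24000 * 10)  # [batch, channels, samples] at 24 kHz
with torch.no_grad():
    codes = mimi.encode(wav)            # discrete tokens, [batch, codebooks, frames]
    reconstructed = mimi.decode(codes)  # back to a 24 kHz waveform
```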
kyutai-labs/delayed-streams-modeling
Kyutai's Speech-To-Text and Text-To-Speech models based on the Delayed Streams Modeling framework.
kyutai-labs/hibiki
Hibiki is a model for streaming speech translation (also known as simultaneous translation). Unlike offline translation, where one waits for the end of the source utterance before starting to translate, Hibiki adapts its flow to accumulate just enough context to produce a correct translation in real time, chunk by chunk.
kyutai-labs/unmute
Make text LLMs listen and speak
kyutai-labs/moshi-finetune
kyutai-labs/moshivis
Kyutai with an "eye"
kyutai-labs/nanoGPTaudio
Code for the blog "Neural audio codecs: how to get audio into LLMs"
kyutai-labs/moshi-swift
kyutai-labs/sphn
Python bindings for symphonia/opus: read various audio formats from Python and write Opus files.
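A minimal usage sketch, assuming the `read` and `write_opus` helpers shown in sphn's documentation; the file names are placeholders:

```python
# Sketch only: assumes sphn exposes read() and write_opus() as documented.
import sphn

# Decode any format symphonia supports (wav, mp3, flac, ogg, ...);
# data is a float32 array of shape (channels, samples).
data, sample_rate = sphn.read("input.mp3")

# Re-encode the same audio as an Opus file.
sphn.write_opus("output.opus", data, sample_rate)
```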
kyutai-labs/dactory
kyutai-labs/yomikomi
A small Rust-based data loader
kyutai-labs/kaudio
Rust crate for some audio utilities
kyutai-labs/ARC-Encoder
kyutai-labs/moshi-webrtc
Proof of concept for running Moshi/Hibiki using WebRTC
kyutai-labs/tts_longeval
kyutai-labs/jax-flash-attn3
JAX bindings for the flash-attention3 kernels
kyutai-labs/jax-flash-attn2
JAX bindings for the flash-attention2 kernels
kyutai-labs/ogg-table
Ogg Vorbis reader with fast random access
kyutai-labs/dora
Dora is an experiment management framework. It expresses grid searches as pure Python files kept in your repo and identifies each experiment with a unique hash signature, letting you scale up to hundreds of experiments without losing your sanity.
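A minimal sketch of what such a grid file can look like, assuming the `Explorer`/`Launcher` API from Dora's documentation; the hyperparameter names here are hypothetical and would map onto your own training entry point:

```python
# Sketch only: grid-file pattern from Dora's docs; "lr" and "batch_size"
# are hypothetical overrides for your own training configuration.
from itertools import product

from dora import Explorer, Launcher


@Explorer
def explorer(launcher: Launcher):
    launcher.slurm_(gpus=2)  # resources requested for every job in this grid
    for lr, batch_size in product([1e-4, 3e-4], [32, 64]):
        # Each scheduled job is identified by the hash signature of its overrides.
        sub = launcher.bind({"lr": lr, "batch_size": batch_size})
        sub()
```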
kyutai-labs/flashy
Lightweight framework for writing deep learning training loops, retaining full freedom to design them as you see fit. It handles checkpointing, logging, distributed training, compatibility with Dora, and more!
kyutai-labs/neural-audio-codecs-anims
Animations for the blog "Neural audio codecs: how to get audio into LLMs"