speech-language-model

There are 7 repositories under the speech-language-model topic.

  • ictnlp/LLaMA-Omni

    LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

    Language: Python
  • jishengpeng/WavTokenizer

    SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling

    Language: Python
  • jishengpeng/WavChat

    A Survey of Spoken Dialogue Models (60 pages)

  • zhenye234/xcodec

    AAAI 2025: Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model

    Language: Python
  • hhguo/SoCodec

    Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications

    Language: Python
  • slp-rl/salmon

    The official code for the SALMon🍣 benchmark

    Language: Python
  • lucadellalib/audiocodecs

    A collection of audio codecs with a standardized API

    Language: Python