cwx-worst-one
Undergraduate student @ SJTU. Research intern @microsoft. Interested in speech and audio understanding and generation.
Shanghai Jiao Tong University
Beijing
Pinned Repositories
cwx-worst-one
My personal repository
cwx-worst-one.github.io
EAT
[IJCAI 2024] EAT: Self-Supervised Pre-Training with Efficient Audio Transformer
Freeze-Omni-test
✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM
GLM-4-Voice-test
GLM-4-Voice: simplified local single-turn and multi-turn inference
LLaMA-Factory
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
LLaMA-Omni-test
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
mini-omni2-test
Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.
SLAM-LLM
Speech, Language, Audio, Music Processing with Large Language Model
cwx-worst-one's Repositories
cwx-worst-one/EAT
[IJCAI 2024] EAT: Self-Supervised Pre-Training with Efficient Audio Transformer
cwx-worst-one/mini-omni2-test
Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.
cwx-worst-one/SLAM-LLM
Speech, Language, Audio, Music Processing with Large Language Model
cwx-worst-one/cwx-worst-one
My personal repository
cwx-worst-one/cwx-worst-one.github.io
cwx-worst-one/Freeze-Omni-test
✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM
cwx-worst-one/GLM-4-Voice-test
GLM-4-Voice: simplified local single-turn and multi-turn inference
cwx-worst-one/LLaMA-Factory
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
cwx-worst-one/LLaMA-Omni-test
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
cwx-worst-one/mini-omni-test
Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.