Zth9730

University of Science and Technology Beijing

Zth9730's Stars

2noise/ChatTTS
A generative speech model for daily dialogue.
Language: Python 33.5k 191 5853.6k
meta-llama/llama3
The official Meta Llama 3 GitHub site
Language: Python 27.9k 234 2743.2k
NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
Language: Python 11.1k 167 8172.5k
kyutai-labs/moshi
Language: Python 7.1k 80 96556
InternLM/xtuner
An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
Language: Python 4.1k 36 549328
FunAudioLLM/SenseVoice
Multilingual Voice Understanding Model
Language: Python 4k 44 162355
qjfoidnh/BaiduPCS-Go
Based on the original iikira/BaiduPCS-Go, with added support for saving share links and rapid-upload links to your own drive
Language: Go 3.1k 28 322469
google-research/big_vision
Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
Language: Jupyter Notebook 2.5k 39 61163
microsoft/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
Language: Python 1.9k 25 184345
fixie-ai/ultravox
A fast multimodal LLM for real-time voice
Language: Python 1.7k 33 54125
VITA-MLLM/VITA
✨✨VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
Language: Python 1.7k 38 67120
bigscience-workshop/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
Language: Python 1.4k 24 144221
yyyujintang/Awesome-Mamba-Papers
Awesome Papers related to Mamba.
1.3k 26 1866
multimodal-art-projection/MAP-NEO
Language: Python 900 11 3484
X-LANCE/SLAM-LLM
Speech, Language, Audio, Music Processing with Large Language Model
Language: Python 637 22 4858
yangdongchao/AcademiCodec
AcademiCodec: An Open Source Audio Codec Model for Academic Research
Language: Python 607 31 4080
liguodongiot/llm-resource
A curated collection of high-quality, full-stack LLM resources
Language: Shell 447 7 048
facebookresearch/speech-resynthesis
An official reimplementation of the method described in the INTERSPEECH 2021 paper - Speech Resynthesis from Discrete Disentangled Self-Supervised Representations.
Language: Python 396 19 2057
FacePerceiver/FaRL
FaRL for Facial Representation Learning [Official, CVPR 2022]
Language: Python 391 8 2522
showlab/videollm-online
VideoLLM-online: Online Video Large Language Model for Streaming Video (CVPR 2024)
Language: Python 278 8 5032
voidful/Codec-SUPERB
Audio Codec Speech processing Universal PERformance Benchmark
Language: Python 236 12 2022
xingchensong/S3Tokenizer
Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) proposed in CosyVoice
Language: Python 225 10 1330
mct10/RepCodec
Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization
Language: Python 164 14 911
Takaaki-Saeki/DiscreteSpeechMetrics
Reference-aware automatic speech evaluation toolkit
Language: Python 137 6 210
ylacombe/finetune-hf-vits
Finetune VITS and MMS using HuggingFace's tools
Language: Python 129 5 4736
young-geng/scalax
A simple library for scaling up JAX programs
Language: Python 128 8 110
yangdongchao/RSTnet
Real-time Speech-Text Foundation Model Toolkit (wip)
Language: Python 126 11 411
asappresearch/wav2seq
Official code for Wav2Seq
Language: Python 96 4 112
XiaoMi/dasheng
Official PyTorch code for Deep Audio-Signal Holistic Embeddings
Language: Python 63 6 47
my-yy/vfal_papers
Voice Face Association Learning Paper List
14 1 11
