lunar333

lunar333's Stars

QwenLM/Qwen2-Audio
The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
Language: Python · 1.1k stars · 64 forks
FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
Language: Python · 4.9k stars · 494 forks
2noise/ChatTTS
A generative speech model for daily dialogue.
Language: Python · 30.8k stars · 3.3k forks
VITA-MLLM/VITA
✨✨VITA: Towards Open-Source Interactive Omni Multimodal LLM
Language: Python · 767 stars · 39 forks
modelscope/ms-swift
Use PEFT or Full-parameter to finetune 350+ LLMs or 90+ MLLMs. (Qwen2.5, GLM4v, Internlm2.5, Yi, Llama3.1, Llava-Video, Internvl2, MiniCPM-V-2.6, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)
Language: Python · 3.5k stars · 298 forks
LinkSoul-AI/LLaSM
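The first open-source, commercially usable dialogue model to support bilingual Chinese-English speech-text multimodal conversation. Convenient speech input greatly improves the experience of using large models that take text as input, while avoiding the cumbersome pipeline of ASR-based solutions and the errors they may introduce.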
Language: Python · 521 stars · 54 forks
echonoshy/cgft-llm
Practice to LLM.
Language: Jupyter Notebook · 301 stars · 58 forks
hiyouga/LLaMA-Factory
Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)
Language: Python · 31.1k stars · 3.8k forks
QwenLM/Qwen2.5
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
Language: Shell · 7.9k stars · 479 forks
ddlBoJack/emotion2vec
[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation
Language: Python · 581 stars · 42 forks
UnicomAI/Unichat-llama3-Chinese
Language: Python · 341 stars · 34 forks
traceless/alist-encrypt
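This project mainly proxies the alist service and provides encryption and decryption for WebDAV. It supports playing encrypted videos and viewing encrypted images online in the alist web page, while operations under WebDAV remain transparent, with files encrypted and decrypted automatically.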
Language: JavaScript · 1.3k stars · 120 forks
Kedreamix/Linly-Talker
Digital Avatar Conversational System - Linly-Talker. 😄✨ Linly-Talker is an intelligent AI system that combines large language models (LLMs) with visual models to create a novel human-AI interaction method. 🤝🤖 It integrates various technologies like Whisper, Linly, Microsoft Speech Services, and SadTalker talking head generation system. 🌟🔬
Language: Python · 1.8k stars · 297 forks
RVC-Boss/GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
Language: Python · 32.8k stars · 3.8k forks
jik876/hifi-gan
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Language: Python · 1.9k stars · 502 forks
f/awesome-chatgpt-prompts
This repo includes ChatGPT prompt curation to use ChatGPT better.
Language: HTML · 111k stars · 15.1k forks
JushBJJ/Mr.-Ranedeer-AI-Tutor
A GPT-4 AI Tutor Prompt for customizable personalized learning experiences.
28.5k stars · 3.3k forks
neulab/BARTScore
BARTScore: Evaluating Generated Text as Text Generation
Language: Python · 315 stars · 38 forks
fishaudio/Bert-VITS2
vits2 backbone with multilingual-bert
Language: Python · 7.8k stars · 1.1k forks
Nekomoekissaten-SUB/Nekomoekissaten-Subs
Subtitle source files from Nekomoe Kissaten. Should there be any issues, please create them in this main repository first.
2.1k stars · 57 forks
RVC-Project/Retrieval-based-Voice-Conversion-WebUI
Easily train a good VC model with voice data <= 10 mins!
Language: Python · 23k stars · 3.4k forks
PierreColombo/nlg_eval_via_simi_measures
NLG evaluation via Statistical Measures of Similarity: BaryScore, DepthScore, InfoLM
Language: Jupyter Notebook · 417
THUDM/ChatGLM3
ChatGLM3 series: Open Bilingual Chat LLMs
Language: Python · 13.3k stars · 1.6k forks
0nutation/SpeechGPT
SpeechGPT Series: Speech Large Language Models
Language: Python · 1.2k stars · 80 forks
lunar333/vits-japanese-finetune
Language: Python · 16
AlexandaJerry/vits-mandarin-biaobei
application of vits on mandarin tts
Language: Jupyter Notebook · 120 stars · 104 forks
snakers4/silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
Language: Python · 4k stars · 395 forks
bmaltais/kohya_ss
Language: Python · 9.4k stars · 1.2k forks
pyannote/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Language: Jupyter Notebook · 5.9k stars · 755 forks
Mastering-Python-GT/Transcription-diarization-whisper-pyannote
Transcription and diarization (speaker identification)
Language: Python · 2611