wbgxx333's Stars
f/awesome-chatgpt-prompts
A curated collection of prompts for getting better results from ChatGPT.
myshell-ai/OpenVoice
Instant voice cloning by MIT and MyShell.
opendatalab/MinerU
A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON.
fishaudio/fish-speech
SOTA Open Source TTS
IDEA-Research/Grounded-Segment-Anything
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
BradyFU/Awesome-Multimodal-Large-Language-Models
✨✨ Latest Advances on Multimodal Large Language Models
OpenMOSS/MOSS
An open-source tool-augmented conversational language model from Fudan University
LianjiaTech/BELLE
BELLE: Be Everyone's Large Language model Engine (an open-source Chinese conversational large language model)
jaywalnut310/vits
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
wgwang/awesome-LLMs-In-China
** large language models
modelscope/FunClip
Open-source, accurate, and easy-to-use video speech recognition and clipping tool, with integrated LLM-based AI clipping.
nghuyong/WeiboSpider
An actively maintained Sina Weibo scraping tool 🚀🚀🚀
facebookresearch/encodec
State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
Zjh-819/LLMDataHub
A quick guide (especially) for trending instruction finetuning datasets
taishi-i/awesome-ChatGPT-repositories
A curated list of resources dedicated to open source GitHub repositories related to ChatGPT
lifeiteng/vall-e
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html
TapXWorld/ChinaTextbook
PDF textbooks for all primary school, junior high, senior high, and university levels.
0nutation/SpeechGPT
SpeechGPT Series: Speech Large Language Models
PyThaiNLP/pythainlp
Thai natural language processing in Python
jiayev/GPT4V-Image-Captioner
jianzhnie/awesome-instruction-datasets
A collection of prompt and instruction datasets for training chat LLMs such as ChatGPT.
facebookresearch/audioseal
Localized watermarking for AI-generated speech audio, with SOTA robustness and a very fast detector
Yuliang-Liu/MultimodalOCR
On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench)
langgptai/Awesome-Multimodal-Prompts
Prompts for GPT-4V & DALL-E3 to fully utilize their multimodal abilities. GPT4V Prompts, DALL-E3 Prompts.
CLUEbenchmark/SuperCLUElyb
SuperCLUE Langya Leaderboard: an anonymous head-to-head evaluation benchmark for general-purpose Chinese large language models
micbuffa/WasabiDataset
Repo for the Wasabi datasets
FreedomIntelligence/MLLM-Bench
MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria
R1ckShi/AESRC2020
[ICASSP2021] Data preparation scripts, training pipeline and baseline experiment results for the Interspeech 2020 Accented English Speech Recognition Challenge (AESRC).
attapol/tltk
Thai Language Toolkit
ag1988/mel-asr
The accompanying code for "Exploring the limits of decoder-only models trained on public speech recognition corpora" (Ankit Gupta, George Saon, Brian Kingsbury. Interspeech 2024).