Pinned Repositories
ASR
GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
impala
hbase-clouder-0.94.6
jaxrl
JAX (Flax) implementation of algorithms for Deep Reinforcement Learning with continuous action spaces.
LLaSM
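The first open-source, commercially usable dialogue model to support bilingual Chinese-English speech-text multimodal conversation. Convenient speech input greatly improves the experience of using large models that take text as input, while avoiding the cumbersome pipeline of ASR-based solutions and the errors it may introduce.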
PunctuationModel
A Chinese punctuation model that adds punctuation marks to text.
Reinforcement-learning-with-tensorflow
Simple Reinforcement learning tutorials
self-llm
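"A Guide to Consuming Open-Source Large Models": quickly deploy open-source large models on AutoDL; a deployment tutorial better suited to users in China.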
tensorflow-with-kenlm
Tensorflow with KenLM integrated for beam search scoring
VTuberTalk
shiyuzh2007's Repositories
shiyuzh2007/GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
shiyuzh2007/3D-Speaker
A repository for single- and multi-modal speaker verification, speaker recognition and speaker diarization.
shiyuzh2007/agent-lightning
The absolute trainer to light up AI agents.
shiyuzh2007/ART
Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more!
shiyuzh2007/awesome-chatgpt
A curated list of awesome ChatGPT related projects.
shiyuzh2007/data-juicer
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
shiyuzh2007/DeepResearch
Tongyi Deep Research, the Leading Open-source Deep Research Agent
shiyuzh2007/dify
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
shiyuzh2007/e2m
E2M converts various file types (doc, docx, epub, html, htm, url, pdf, ppt, pptx, mp3, m4a) into Markdown. It’s easy to install, with dedicated parsers and converters, supporting custom configs. E2M offers an all-in-one, flexible, and open-source solution.
shiyuzh2007/facechain
FaceChain is a deep-learning toolchain for generating your Digital-Twin.
shiyuzh2007/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models.
shiyuzh2007/Isaac-GR00T
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model for generalized humanoid robot reasoning and skills.
shiyuzh2007/langmanus
A community-driven AI automation framework that builds upon the incredible work of the open source community. Our goal is to combine language models with specialized tools for tasks like web search, crawling, and Python code execution, while giving back to the community that made this possible.
shiyuzh2007/LogicRAG
Implementation of Logic-RAG
shiyuzh2007/MaliangAINovalWriter
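Maliang AI Writing is an intelligent writing platform designed for novelists and platform operators. It combines powerful AI models (supporting OpenAI, Gemini, Anthropic, and others) with a professional online rich-text editor to help authors find inspiration, write more efficiently, and manage their work, while giving platform administrators powerful back-end management and monitoring features.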
shiyuzh2007/MinerU
A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON.
shiyuzh2007/open-editor
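A web-based, localized, open-source intelligent editor platform that supports editing and previewing rich text and many other document types, including Word, Excel, PPT, Markdown, mind maps, and flowcharts.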
shiyuzh2007/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
shiyuzh2007/OpenManus
No fortress, purely open ground. OpenManus is Coming.
shiyuzh2007/openvla
OpenVLA: An open-source vision-language-action model for robotic manipulation.
shiyuzh2007/promptMinder
An open-source platform focused on prompt management.
shiyuzh2007/RAG-Anything
RAG-Anything: All-in-One RAG System
shiyuzh2007/ragflow-plus
Ragflow-Plus is a derivative of Ragflow, reworked to be simpler and more practical.
shiyuzh2007/ROLL
An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models
shiyuzh2007/Search-o1
🔍 Search-o1: Agentic Search-Enhanced Large Reasoning Models [EMNLP 2025]
shiyuzh2007/UI-TARS-desktop
The Open-sourced Multimodal AI Agent Stack connecting Cutting-edge AI Models and Agent Infra.
shiyuzh2007/unstructured
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
shiyuzh2007/verl
verl: Volcano Engine Reinforcement Learning for LLMs
shiyuzh2007/WebThinker
🌐 WebThinker: Empowering Large Reasoning Models with Deep Research Capability
shiyuzh2007/wukong-robot
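🤖 wukong-robot is a simple, flexible, and elegant Chinese voice dialogue robot and smart speaker project. It supports ChatGPT multi-turn conversation and may also be the first open-source smart speaker project to support brain-computer interaction.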