xiaoxiang52's Stars
gkd-kit/gkd
An Android app for custom screen tapping based on Accessibility, advanced selectors, and subscription rules.
songquanpeng/one-api
LLM API management and key redistribution system that unifies multiple providers under a single API. Supports mainstream models including OpenAI, Azure, Anthropic Claude, Google Gemini, DeepSeek, ByteDance Doubao, ChatGLM, ERNIE Bot, iFlytek Spark, Tongyi Qianwen, 360 Zhinao, and Tencent Hunyuan. Single executable with a Docker image, one-click deployment, works out of the box, and includes an English UI.
Calcium-Ion/new-api
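AI model API management and distribution system that exposes multiple large models through a unified calling format, supporting the OpenAI, Claude, and other formats. Suitable for individuals or for internal enterprise use to manage and distribute channels. Developed as a fork of One API. 🍥 The next-generation LLM gateway and AI asset management system, with multi-language support.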
microsoft/UFO
A UI-Focused Agent for Windows OS Interaction.
corbt/agent.exe
HaujetZhao/CapsWriter
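A desktop voice input tool. Once it is running, hold the Caps Lock key for more than 0.3 seconds to start speech recognition; when you release the key, the recognized text is typed automatically.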
chidiwilliams/buzz
Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI's Whisper.
revdotcom/fstalign
An efficient OpenFST-based tool for calculating WER and aligning two transcript sequences.
pipecat-ai/pipecat
Open Source framework for voice and multimodal conversational AI
mudler/LocalAI
🤖 The free, open-source alternative to OpenAI, Claude, and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers, and many more model architectures. Features: text, audio, video, and image generation, voice cloning, and distributed P2P inference.
modelscope/ClearerVoice-Studio
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.
v3ucn/ASR_TOOLS_SenseVoice_WebUI
Standalone integrated WebUI for Bert-vits2 transcription and annotation, integrating Alibaba's FunAsr, Bcut (必剪) Asr, and the Whisper large model.
isaacyan17/sensevoice-flutter
Asr Flutter is a Flutter plugin for local, offline speech recognition, built on SenseVoice, Alibaba's open-source large speech model.
VignetteApril/VocalSearch
This project is a speech-to-text and file retrieval system using SenseVoice for audio transcription and Elasticsearch for file indexing and searching. Users can upload audio files, transcribe them to text, and search for file names in a specified directory based on the transcription results.
BrowserOrientedProgramer/ChatBot
Uses SenseVoice for speech-to-text, an Ollama Qwen2 model to generate the conversational reply, and ChatTTS for text-to-speech.
FunAudioLLM/SenseVoice
Multilingual Voice Understanding Model
SpenserCai/ComfyUI-FunAudioLLM
ComfyUI custom nodes for FunAudioLLM, including CosyVoice and SenseVoice.
WEIFENG2333/VideoCaptioner
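🎬 卡卡字幕助手 | VideoCaptioner: an LLM-based intelligent subtitle assistant that handles the full workflow of subtitle generation, sentence segmentation, correction, and translation. A tool for easy and efficient video subtitling.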
PyJun/Mooc_Downloader
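学无止下载器, a MOOC and online course downloader. Supported platforms: **大学慕课, 网易云课堂, 有道精品课, 有道领世, 腾讯课堂, 腾讯会议, B站课堂, 中公网校, 伯索云, 爱问云, 高途, 途途, 研途, 学浪, 抖音课堂, 千聊, 兴趣岛, 橙啦, 超星学习通, 学银在线, 智慧职教, 职教云, 知到智慧树, 学堂在线, 爱课程. Videos and courseware can be downloaded together.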
wudududu/extract-video-ppt
Extract the PPT slides shown in a video.
aabi8113/video2slides
x-tropy/docRoll
Turn complex programming knowledge 📚 into engaging, AI-powered video 📺 lessons.
kachiO/slide-scraper
Script to scrape presentation slides from a YouTube video.
wangxingkang/video2docs
A desktop application that converts videos to PPT.
AlephZeroConsulting/AI-PPT-Extraction
Extract PPT slides from a YouTube video or any other video.
king-jingxiang/extract_video_ppt_to_markdown
Miss-Yhh/Extract-PPT-from-video
Recover a speaker's PPT slides from the video the speaker recorded.
mendableai/firecrawl
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
VirtualDrivers/Virtual-Display-Driver
Add virtual monitors to your Windows 10/11 device! Works with VR, OBS, Sunshine, and/or any desktop sharing software.
taowen/awesome-lowcode
A discussion community for practitioners working on low-code platforms in China.