zzxxchen's Stars
microsoft/markitdown
Python tool for converting files and office documents to Markdown.
pdulvp/jellyfin-qnap
Jellyfin server packaging for QNAP NAS
cyubuchen/TikTok_Unlock
TikTok unlocking, region switching, live streaming, and watermark-free video download
DaoCloud/public-image-mirror
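Many images are hosted overseas, such as gcr. Downloads from within China are slow and need acceleration. Dedicated to providing a stable, reliable, and secure container image service that connects the whole world.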
Jeric-X/SyncClipboard
Cross-Platform Clipboard Syncing Solution
Nutlope/llama-ocr
Document to Markdown OCR library with Llama 3.2 vision
hanxi/xiaomusic
Play music on XiaoAi speakers; the music is downloaded with yt-dlp.
SWivid/F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
getomni-ai/zerox
PDF to Markdown with vision models
Dicklesworthstone/llm_aided_ocr
Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections.
VikParuchuri/surya
OCR, layout analysis, reading order, table recognition in 90+ languages
CatchTheTornado/text-extract-api
Document (PDF, Word, PPTX ...) extraction and parsing API using state-of-the-art modern OCRs + Ollama-supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown
shallowdream204/DreamClear
[NeurIPS 2024🔥] DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation
hiroi-sora/Umi-OCR
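OCR software, free, open-source, and offline. Supports screenshot OCR and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and QR code scanning and generation. Ships with built-in multi-language libraries.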
decaywood/XueQiuSuperSpider
Super crawler for Xueqiu stock information
myshell-ai/OpenVoice
Instant voice cloning by MIT and MyShell. Audio foundation model.
BluePointLilac/ContextMenuManager
🖱️ A pure Windows right-click context menu manager
lissettecarlr/speaker-diarization
Extracts the voices of different speakers from a video and saves them separately to produce audio training data
modelscope/3D-Speaker
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
Spr-Aachen/Easy-Voice-Toolkit
A locally deployable AI voice toolbox | A user-friendly audio toolkit for voice recognition, voice transcription, voice conversion, etc.
svcvit/Awesome-Dify-Workflow
Sharing some useful Dify DSL workflows, suitable for both personal use and learning.
Rudrabha/Wav2Lip
This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For the HD commercial model, please try out Sync Labs
asweigart/pyautogui
A cross-platform GUI automation Python module for human beings. Used to programmatically control the mouse & keyboard.
xfangfang/Macast
Macast is a cross-platform application that uses mpv as a DLNA Media Renderer.
Saroth/docker_wechat
A solution for running WeChat in a container on Linux, deployed on top of WeChatFerry
jianchang512/clone-voice
A voice cloning tool with a web interface that uses your voice or any sound to record audio
jianchang512/pyvideotrans
Translate a video from one language to another and add dubbing. Also supports speech recognition transcription, speech synthesis, and subtitle translation.
Huanshere/VideoLingo
Netflix-level subtitle cutting, translation, alignment, and even dubbing - one-click fully automated AI video subtitle team
beclab/Olares
Olares: An Open-Source Sovereign Cloud OS for Local AI
GuijiAI/ReHiFace-S
Real Time High-Fidelity Faceswap