CatherineZhou's Stars
PaddlePaddle/PaddleOCR
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
lllyasviel/ControlNet
Let us control diffusion models!
google-ai-edge/mediapipe
Cross-platform, customizable ML solutions for live and streaming media.
iperov/DeepFaceLive
Real-time face swap for PC streaming or video calls
infiniflow/ragflow
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
lllyasviel/style2paints
sketch + style = paints (TOG2018/SIGGRAPH2018ASIA)
stanford-oval/storm
An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
SYSTRAN/faster-whisper
Faster Whisper transcription with CTranslate2
netease-youdao/QAnything
Question and Answer based on Anything.
FlagOpen/FlagEmbedding
Retrieval and Retrieval-augmented LLMs
PeterL1n/BackgroundMattingV2
Real-Time High-Resolution Background Matting
HVision-NKU/StoryDiffusion
Accepted as [NeurIPS 2024] Spotlight Presentation Paper
YaoFANGUK/video-subtitle-remover
AI-based tool for removing hard-coded subtitles and text-like watermarks from videos or pictures, producing output files at the original resolution. Runs locally; no third-party API required.
yisol/IDM-VTON
[ECCV2024] IDM-VTON : Improving Diffusion Models for Authentic Virtual Try-on in the Wild
MahmoudAshraf97/whisper-diarization
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
TencentARC/InstantMesh
InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models
TMElyralab/MuseTalk
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
facebookresearch/DPR
Dense Passage Retriever (DPR) is a set of tools and models for open-domain Q&A tasks.
jrottenberg/ffmpeg
Docker build for FFmpeg on Ubuntu / Alpine / Centos / Scratch / nvidia / vaapi
ZiqiaoPeng/SyncTalk
[CVPR 2024] This is the official source for our paper "SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis"
RUC-NLPIR/FlashRAG
⚡FlashRAG: A Python Toolkit for Efficient RAG Research
hao-ai-lab/LookaheadDecoding
[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
wuhuikai/FaceSwap
Swap face between two photos.
JinhuaLiang/WavCraft
Official repo for WavCraft, an AI agent for audio creation and editing
Hujiazeng/Vach
Real time streaming talking head
hao-ai-lab/Consistency_LLM
[ICML 2024] CLLMs: Consistency Large Language Models
HuskyInSalt/CRAG
Corrective Retrieval Augmented Generation
eric-ai-lab/swap-anything
"SwapAnything: Enabling Arbitrary Object Swapping in Personalized Visual Editing"
starsuzi/Adaptive-RAG
NanKeRen2020/UVR5_Linux
Ultimate Vocal Remover application running on Linux (Ubuntu 16.04)