willyzw1221's Stars
fudan-generative-vision/hallo2
Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation
SWivid/F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
pkunlp-icler/FastV
[ECCV 2024 Oral] Code for paper: An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models
GeorgeLuImmortal/PaDeLLM_NER
langchain-ai/open-canvas
📃 A better UX for chat, writing content, and coding with LLMs.
YoMio-Tech-Inc/GPT-SoVITS2
GPT-SoVITS2
HKUDS/LightRAG
"LightRAG: Simple and Fast Retrieval-Augmented Generation"
Leymore/ruozhiba
jy0205/Pyramid-Flow
Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
The-Run-Philosophy-Organization/run
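The officially designated global GitHub for Run Philosophy (润学). It collects the aims, program, theory, and assorted real-world examples of "running" (润); addresses the three questions of why to run, where to run, and how to run; and aims to become the core religion and core belief of the new ** people.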
ToTheBeginning/PuLID
[NeurIPS 2024] Official code for PuLID: Pure and Lightning ID Customization via Contrastive Alignment
LinglongQian/Medical-Graph-RAG
Medical Graph RAG: Graph RAG for the Medical Data
Gsllchb/Handright
A lightweight Python library for simulating Chinese handwriting
gpt-omni/mini-omni
Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.
facebookresearch/sapiens
High-resolution models for human tasks.
openai/baselines
OpenAI Baselines: high-quality implementations of reinforcement learning algorithms
NExT-GPT/NExT-GPT
Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model
Doubiiu/ToonCrafter
[SIGGRAPH Asia 2024, Journal Track] ToonCrafter: Generative Cartoon Interpolation
Arthurzhangsheng/echomimic-all-in-one-package
EchoMimic all-in-one package for Windows: no environment setup required, ready to use after extraction
Azure-Samples/graphrag-accelerator
One-click deploy of a Knowledge Graph powered RAG (GraphRAG) in Azure
hacksider/Deep-Live-Cam
Real-time face swap and one-click video deepfake with only a single image
BadToBest/EchoMimic
Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
jianchang512/pyvideotrans
Translate a video from one language to another and add dubbing; API calls are also supported.
LayTextLLM/LayTextLLM
danielgatis/rembg
Rembg is a tool to remove image backgrounds
Stirling-Tools/Stirling-PDF
#1 Locally hosted web application that allows you to perform various operations on PDF files
microsoft/graphrag
A modular graph-based Retrieval-Augmented Generation (RAG) system
fanmingming/live
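✯ A directly accessible TV/radio icon library and related tools project ✯ 🔕 Permanently free, direct access, fully open source, continuously improved channel logos, IPv4/IPv6 dual-stack access supported 🔕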
microsoft/table-transformer
Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS evaluation metric.
KwaiVGI/LivePortrait
Bring portraits to life!