Pinned Repositories
Ctrl-Adapter
Official implementation of Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model
DeepPavlov
An open source library for deep learning end-to-end dialog systems and chatbots.
face_mask_dataset
Manually labeled dataset of faces and mask-wearing faces, for use with object detection models.
gradio
Create UIs for prototyping your machine learning model in 3 minutes
node-serialport
Access serial ports with JavaScript. Linux, OSX and Windows. Welcome your robotic JavaScript overlords. Better yet, program them!
phidata
Add memory, knowledge and tools to LLMs
searxng
SearXNG is a free internet metasearch engine which aggregates results from various search services and databases. Users are neither tracked nor profiled.
speech-driven-animation
SRGAN
Torch implementation of SRGAN (Ledig et al., Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network, 2016)
Upscale-A-Video
Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution
anthonyyuan's Repositories
anthonyyuan/ai-toolkit
Various AI scripts. Mostly Stable Diffusion stuff.
anthonyyuan/AnchorCrafter
anthonyyuan/anonymous_code
DroneSplat: 3D Gaussian Splatting for Robust 3D Reconstruction from In-the-Wild Drone Imagery
anthonyyuan/browser-use
Make websites accessible for AI agents
anthonyyuan/ComfyUI-MochiEdit
ComfyUI nodes to edit videos using Genmo Mochi
anthonyyuan/ComfyUI-MochiWrapper
anthonyyuan/ComfyUI-OmniGen
ComfyUI-OmniGen - A ComfyUI custom node implementation of OmniGen, a powerful text-to-image generation and editing model.
anthonyyuan/convex-splatting
Original implementation of "3D Convex Splatting: Radiance Field Rendering with 3D Smooth Convexes"
anthonyyuan/dify
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
anthonyyuan/DimensionX
DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion
anthonyyuan/DisPose
This repository is the official implementation of "DisPose: Disentangling Pose Guidance for Controllable Human Image Animation"
anthonyyuan/EasyAnimate
📺 An End-to-End Solution for High-Resolution and Long Video Generation Based on Transformer Diffusion
anthonyyuan/echomimic_v2
EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation
anthonyyuan/efficientvit
Efficient vision foundation models for high-resolution generation and perception.
anthonyyuan/Fay
Fay is an open-source digital human framework integrating language models and digital characters. It offers retail, assistant, and agent versions for diverse applications like virtual shopping guides, broadcasters, assistants, waiters, teachers, and voice or text-based mobile assistants.
anthonyyuan/FlipSketch
FlipSketch: Flipping Static Drawings to Text-Guided Sketch Animations
anthonyyuan/gazelle
anthonyyuan/Genesis
A generative world for general-purpose robotics & embodied AI learning.
anthonyyuan/In-Context-LoRA
Official repository of In-Context LoRA for Diffusion Transformers
anthonyyuan/Leffa
Learning Flow Fields in Attention for Controllable Person Image Generation
anthonyyuan/new-api
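AI model API management and distribution system. Supports converting multiple large models to OpenAI-format calls, supports Midjourney Proxy, Suno, and Rerank, and is compatible with the EasyPay (易支付) protocol. Intended only for personal use or for internal enterprise management and channel distribution; do not use it for commercial purposes. This project is a secondary development based on One API.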
anthonyyuan/OmniGen
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
anthonyyuan/PAI-RAG
An easy-to-use framework for modular RAG
anthonyyuan/Regional-Prompting-FLUX
Training-free Regional Prompting for Diffusion Transformers 🔥
anthonyyuan/samurai
Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"
anthonyyuan/screenpipe
rewind.ai x cursor.com = your AI assistant that has all the context. 24/7 screen & voice recording for the age of super intelligence. get your data ready or be left behind
anthonyyuan/servers
Model Context Protocol Servers
anthonyyuan/StableV2V
The official implementation of the paper titled "StableV2V: Stablizing Shape Consistency in Video-to-Video Editing".
anthonyyuan/VistaDream
[arXiv'24] VistaDream: Sampling multiview consistent images for single-view scene reconstruction
anthonyyuan/x.infer
Framework agnostic computer vision inference.