boboyiyi's Stars
black-forest-labs/flux
Official inference repo for FLUX.1 models
ExistentialAudio/BlackHole
BlackHole is a modern macOS audio loopback driver that allows applications to pass audio to other applications with zero additional latency.
facebookresearch/segment-anything-2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
idealvin/coost
A tiny boost library in C++11.
BadToBest/EchoMimic
EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
PeterH0323/Streamer-Sales
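Streamer-Sales (Sales Champion): a sales-streamer LLM 🛒🎁 that, given a product's features, presents the product in a way designed to spark viewers' desire to buy. 🚀⭐ Includes a detailed data generation pipeline ❗ 📦 Also integrates LMDeploy accelerated inference 🚀, RAG (retrieval-augmented generation) 📚, TTS (text-to-speech) 🔊, digital human generation 🦸, an Agent that queries the web for real-time information 🌐, ASR (speech-to-text) 🎙️, a Vue-based frontend 🍍, a FastAPI backend 🗝️, and Docker Compose packaging and deployment 🐋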
malinkang/weread2notion-pro
PowerHouseMan/ComfyUI-AdvancedLivePortrait
kijai/ComfyUI-LivePortraitKJ
ComfyUI nodes for LivePortrait
YangLing0818/RPG-DiffusionMaster
[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)
karpathy/nano-llama31
nanoGPT style version of Llama 3.1
muzishen/IMAGDressing
[AAAI 2025]👔IMAGDressing👔: Interactive Modular Apparel Generation for Virtual Dressing. It enables customizable human image generation with flexible garment, pose, and scene control, ensuring high fidelity and garment consistency for virtual dressing.
OpenTeleVision/TeleVision
[CoRL 2024] Open-TeleVision: Teleoperation with Immersive Active Visual Feedback
facebookresearch/ocean
Ocean is the in-house framework for Computer Vision (CV) and Augmented Reality (AR) applications at Meta. It is platform independent and is mainly implemented in C/C++.
warmshao/FasterLivePortrait
Bring portraits to life in real time! ONNX/TensorRT support! Real-time portrait driving!
IDEA-Research/X-Pose
[ECCV 2024] Official implementation of the paper "X-Pose: Detecting Any Keypoints"
Vchitect/VEnhancer
Official codes of VEnhancer: Generative Space-Time Enhancement for Video Generation
huangyangyi/TeCH
[3DV 2024] Official repo of "TeCH: Text-guided Reconstruction of Lifelike Clothed Humans"
aim-uofa/MovieDreamer
OpenT2S/LlamaVoice
LlamaVoice is a Llama-based large voice generation model that supports both inference and training.
Francis-Rings/MotionFollower
MotionFollower: Editing Video Motion via Lightweight Score-Guided Diffusion
yanivw12/gs2mesh
[ECCV 2024] Official implementation of the paper "GS2Mesh: Surface Reconstruction from Gaussian Splatting via Novel Stereo Views"
aihacker111/Efficient-Live-Portrait
Fast running Live Portrait with TensorRT and ONNX models
bornfly-detachment/asymmetric_magvitv2
The strongest open-source implementation of asymmetric magvit_v2 as of 2024; it provides inference code but excludes VQVAE. It supports joint encoding of images and videos and accommodates arbitrary video lengths and resolutions. It surpasses all open-source models in FID and FVD, with 4z and 16z models available on Hugging Face.
guoqincode/Focus-on-Your-Instruction
[CVPR 2024] Focus on Your Instruction: Fine-grained and Multi-instruction Image Editing by Attention Modulation
KeyuWu-CS/MonoHair
Code of MonoHair: High-Fidelity Hair Modeling from a Monocular Video
XuanchenLi/Topo4D
[ECCV 2024] Official implementation of Topo4D: Topology-Preserving Gaussian Splatting for High-Fidelity 4D Head Capture
asw91666/TRG-Release
Official PyTorch implementation of "6DoF Head Pose Estimation through Explicit Bidirectional Interaction with Face Geometry," ECCV 2024
TencentQQGYLab/LinguaLinker
LinguaLinker: Audio-Driven Portraits Animation with Implicit Facial Control Enhancement
eccv2024tcan/TCAN