JimLee4530
Postgraduate at the CS Department, Hangzhou Dianzi University.
Media Intelligence Laboratory (MIL@HDU), Hangzhou, China
JimLee4530's Stars
yt-dlp/yt-dlp
A feature-rich command-line audio/video downloader
tiangolo/fastapi
FastAPI framework, high performance, easy to learn, fast to code, ready for production
labmlai/annotated_deep_learning_paper_implementations
🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans (cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
xai-org/grok-1
Grok open release
geekan/MetaGPT
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
harry0703/MoneyPrinterTurbo
Generate high-definition short videos with one click using large AI models (LLMs).
cumulo-autumn/StreamDiffusion
StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation
xxlllq/system_architect
💯 2024 exam preparation materials for the System Architecture Designer certification (Ruankao, senior level).
levihsu/OOTDiffusion
Official implementation of OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on
Zejun-Yang/AniPortrait
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
VAST-AI-Research/TripoSR
fudan-generative-vision/champ
Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
openai/transformer-debugger
alibaba-damo-academy/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
geekan/scrapy-examples
Multifarious Scrapy examples. Spiders for alexa / amazon / douban / douyu / github / linkedin etc.
TMElyralab/MuseTalk
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
TMElyralab/MuseV
MuseV: Infinite-length and High Fidelity Virtual Human Video Generation with Visual Conditioned Parallel Denoising
harlanhong/awesome-talking-head-generation
Picsart-AI-Research/StreamingT2V
StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
AILab-CVC/UniRepLKNet
[CVPR'24] UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition
mayuelala/FollowYourClick
[arXiv 2024] Follow-Your-Click: This repo is the official implementation of "Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts"
sail-sg/MDT
Masked Diffusion Transformer is the SOTA for image synthesis. (ICCV 2023)
TIGER-AI-Lab/AnyV2V
Code and data for "AnyV2V: A Tuning-Free Framework For Any Video-to-Video Editing Tasks"
MStypulkowski/diffused-heads
Official repository for Diffused Heads: Diffusion Models Beat GANs on Talking-Face Generation
whlzy/FiT
[ICML 2024 Spotlight] FiT: Flexible Vision Transformer for Diffusion Model
TIGER-AI-Lab/ConsistI2V
ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation (TMLR 2024)
PatrickZH/DeepCore
Code for coreset selection methods
johndpope/Emote-hack
Emote Portrait Alive - using AI to reverse engineer code from the white paper. (abandoned)
AGI-Edgerunners/IIL
Code for our Paper "All in an Aggregated Image for In-Image Learning"