Pinned Repositories
2015-terminal-interview-git
SYSU Apple Club - 2015 - Terminal Department - Second Round Interview - Git Learning
aphantasia
CLIP + FFT/DWT/RGB = text to image/video
Ask-Anything
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
books
useful books
BoxDiff
[ICCV 2023] BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion
CLIP
Contrastive Language-Image Pretraining
Crop-CLIP
Crop using CLIP
Distributed-PC-Darts
Distributed implementation of PC-DARTS. This code is based on the original PC-DARTS implementation and supports searching and training across multiple nodes and GPUs using distributed data parallel. Only distributed search and retraining on CIFAR-10 are implemented; you can modify it for your own datasets.
glide-text2im
GLIDE: a diffusion-based text-conditional image synthesis model
SalMetric
A Python program for evaluating saliency map results
yangbinb's Repositories
yangbinb/aphantasia
CLIP + FFT/DWT/RGB = text to image/video
yangbinb/Ask-Anything
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
yangbinb/BoxDiff
[ICCV 2023] BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion
yangbinb/CLIP_prefix_caption
Simple image captioning model
yangbinb/CogVideo
Text-to-video generation. The repo for ICLR2023 paper "CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers"
yangbinb/CoDeF
Official PyTorch implementation of CoDeF: Content Deformation Fields for Temporally Consistent Video Processing
yangbinb/ComfyUI-DragNUWA
yangbinb/ComfyUI-Marigold
Marigold depth estimation in ComfyUI
yangbinb/CVPR23_LFDM
The PyTorch implementation of our CVPR 2023 paper "Conditional Image-to-Video Generation with Latent Flow Diffusion Models"
yangbinb/DirectInversion
Official repo for paper "Direct Inversion: Boosting Diffusion-based Editing with 3 Lines of Code"
yangbinb/Director3D
Code for "Director3D: Real-world Camera Trajectory and 3D Scene Generation from Text".
yangbinb/DragNUWA
yangbinb/DynamiCrafter
DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
yangbinb/FIFO-Diffusion_public
Official implementation of FIFO-Diffusion
yangbinb/lorahub
yangbinb/mistral-src
Reference implementation of the Mistral AI 7B v0.1 model.
yangbinb/Omost
Your image is almost there!
yangbinb/Open-Sora
Building your own video generation model like OpenAI's Sora
yangbinb/rich-text-to-image
Rich-Text-to-Image Generation
yangbinb/stable-diffusion
yangbinb/svd-temporal-controlnet
yangbinb/T-Rex
T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy
yangbinb/Text-To-Video-Finetuning
Finetune ModelScope's Text To Video model using Diffusers 🧨
yangbinb/TrackDiffusion
Official PyTorch implementation of TrackDiffusion (https://arxiv.org/abs/2312.00651)
yangbinb/TTNet-Real-time-Analysis-System-for-Table-Tennis-Pytorch
Unofficial implementation of "TTNet: Real-time temporal and spatial video analysis of table tennis" (CVPR 2020)
yangbinb/vector-quantize-pytorch
Vector Quantization, in PyTorch
yangbinb/Video-BLIP2-Preprocessor
A simple script that reads a directory of videos, grabs a random frame, and automatically discovers a prompt for it
yangbinb/vidmaestro.github.io
yangbinb/WaveDiff
Official PyTorch implementation of the paper: Wavelet Diffusion Models are fast and scalable Image Generators (CVPR'23)
yangbinb/xtuner
An efficient, flexible and full-featured toolkit for fine-tuning large models (InternLM, Llama, Baichuan, Qwen, ChatGLM)