hw-liang's Stars
lucidrains/vit-pytorch
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
HumanAIGC/AnimateAnyone
Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation
khangich/machine-learning-interview
Machine Learning Interviews from FAANG, Snapchat, LinkedIn. I have offers from Snapchat, Coupang, Stitchfix, etc. Blog: mlengineer.io.
lucidrains/denoising-diffusion-pytorch
Implementation of Denoising Diffusion Probabilistic Model in PyTorch
gaomingqi/Track-Anything
Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything, XMem, and E2FGVI.
itcharge/LeetCode-Py
⛽️ "Algorithm Handbook": a highly detailed tutorial on the fundamentals of algorithms and data structures, teaching algorithms from scratch, with detailed solutions to 850+ LeetCode problems and 200 popular big-tech interview questions.
AILab-CVC/VideoCrafter
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
showlab/Awesome-Video-Diffusion
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
fundamentalvision/BEVFormer
[ECCV 2022] This is the official implementation of BEVFormer, a camera-only framework for autonomous driving perception, e.g., 3D object detection and semantic map segmentation.
aim-uofa/AdelaiDet
AdelaiDet is an open source toolbox for multiple instance-level detection and recognition tasks.
google-research/scenic
Scenic: A JAX Library for Computer Vision Research and Beyond
Doubiiu/DynamiCrafter
[ECCV 2024, Oral] DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
SUDO-AI-3D/zero123plus
Code repository for Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model.
xiaobai1217/Awesome-Video-Datasets
Video datasets
CLAY-3D/OpenCLAY
CLAY: A Controllable Large-scale Generative Model for Creating High-quality 3D Assets
alibaba/animate-anything
Fine-Grained Open Domain Image Animation with Motion Guidance
WangYueFt/detr3d
jiawen-zhu/HQTrack
Tracking Anything in High Quality
bytedance/ibot
iBOT :robot:: Image BERT Pre-Training with Online Tokenizer (ICLR 2022)
katerakelly/pytorch-maml
PyTorch implementation of MAML: https://arxiv.org/abs/1703.03400
microsoft/esvit
EsViT: Efficient self-supervised Vision Transformers
VITA-Group/Diffusion4D
"Diffusion4D: Fast Spatial-temporal Consistent 4D Generation via Video Diffusion Models", Hanwen Liang*, Yuyang Yin*, Dejia Xu, Hanxue Liang, Zhangyang Wang, Konstantinos N. Plataniotis, Yao Zhao, Yunchao Wei
junchen14/Multi-Modal-Transformer
This repository collects various multi-modal transformer architectures, including image transformers, video transformers, image-language transformers, video-language transformers, and self-supervised learning models. It also gathers useful tutorials and tools in these related domains.
mzhaoshuai/CenterCLIP
[SIGIR 2022] CenterCLIP: Token Clustering for Efficient Text-Video Retrieval. Also, a text-video retrieval toolbox based on CLIP + fast pyav video decoding.
layer6ai-labs/xpool
https://layer6ai-labs.github.io/xpool/
TengdaHan/TemporalAlignNet
[CVPR'22 Oral] Temporal Alignment Networks for Long-term Video. Tengda Han, Weidi Xie, Andrew Zisserman.
sail-sg/mugs
A PyTorch implementation of Mugs proposed by our paper "Mugs: A Multi-Granular Self-Supervised Learning Framework".
VITA-Group/Comp4D
"Comp4D: Compositional 4D Scene Generation", Dejia Xu*, Hanwen Liang*, Neel P. Bhatt, Hezhen Hu, Hanxue Liang, Konstantinos N. Plataniotis, and Zhangyang Wang
princeton-nlp/DataMUX
[NeurIPS 2022] DataMUX: Data Multiplexing for Neural Networks
elisakreiss/concadia