hzhang57

Postdoc Fellow, CS, City University of Hong Kong

Pinned Repositories

2prime.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
Language: JavaScript 0 0 00
2s-AGCN
Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition in CVPR19
Language: Python 0 1 00
3D-Human-Body-Shape
Language: Python 0 1 00
3d-pose-baseline
A simple baseline for 3D human pose estimation in TensorFlow. Presented at ICCV 17.
Language: Python 0 1 00
action-detection
Temporal action detection with SSN
Language: Python 1 1 00
dense_flow
Tools to extract dense optical flow from videos, based on OpenCV
Language: C++ 1 1 00
repulsion_loss_ssd
Repulsion Loss: Detecting Pedestrians in a Crowd. https://arxiv.org/abs/1711.07752
Language: Python 3 2 00
TrajectoryNet-1
Language: Python 1 1 00

hzhang57's Repositories

hzhang57/hzhang57.github.io
Language: HTML 1 2 00
hzhang57/2prime.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
Language: JavaScript 0 0 00
hzhang57/Awesome-CLIP
Awesome list for research on CLIP (Contrastive Language-Image Pre-Training).
0 0
hzhang57/awesome-vision-language-pretraining-papers
Recent Advances in Vision and Language PreTrained Models (VL-PTMs)
0 0
hzhang57/behave-dataset
Code to access the BEHAVE dataset
Language: Python 1 0
hzhang57/chain-of-thought-hub
Benchmarking large language models' complex reasoning ability with chain-of-thought prompting
Language: Jupyter Notebook 0 0
hzhang57/CLIP
Contrastive Language-Image Pretraining
Language: Jupyter Notebook 0 0
hzhang57/CogVideo
Text-to-video generation.
0 0
hzhang57/coyo-dataset
COYO-700M: Large-scale Image-Text Pair Dataset
Language: Python 0 0
hzhang57/GLIP
Grounded Language-Image Pre-training
Language: Python 0 0
hzhang57/Group-Contextualization
[CVPR22] Group Contextualization for Video Recognition
Language: Python 1 0
hzhang57/GSS
[CVPR 2023] Official repository of Generative Semantic Segmentation
Language: Python 0 0
hzhang57/HowToLiveLonger
程序员延寿指南 | A programmer's guide to living longer
1 0
hzhang57/LaViLa
Code release for "Learning Video Representations from Large Language Models"
Language: Python 0 0
hzhang57/lightning-sam
Fine-tune Segment-Anything Model with Lightning Fabric.
Language: Python 0 0
hzhang57/Mask2Former
Code release for "Masked-attention Mask Transformer for Universal Image Segmentation"
Language: Python 1 0
hzhang57/mega
Sequence modeling with Mega.
Language: Python 0 0
hzhang57/METER
METER: A Multimodal End-to-end TransformER Framework
Language: Python 0 0
hzhang57/multimodal-maestro
Effective prompting for Large Multimodal Models like GPT-4 Vision or LLaVA. 🔥
Language: Python 0 0
hzhang57/Neighborhood-Attention-Transformer
[Preprint] Neighborhood Attention Transformer, 2022
Language: Python 1 0
hzhang57/openai-cookbook
Examples and guides for using the OpenAI API
Language: Python 0 0
hzhang57/Paper-Implementation-Template
A simple reproducible template to implement AI research papers
0 0
hzhang57/Pointcept
Pointcept: a codebase for point cloud perception research. Latest works: MSC, CeCo (CVPR 2023)
Language: Python 0 0
hzhang57/pytorch_scatter
PyTorch Extension Library of Optimized Scatter Operations
Language: Python 0 0
hzhang57/qna
[CVPR2022 - Oral] Official Jax Implementation of Learned Queries for Efficient Local Attention
Language: Python 1 0
hzhang57/SimCLR
PyTorch implementation of SimCLR: A Simple Framework for Contrastive Learning of Visual Representations by T. Chen et al.
Language: Python 0 0
hzhang57/VideoMAE
VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Language: Python 1 0
hzhang57/vidt
Language: Python 1 0
hzhang57/VILA
VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)
Language: Python
hzhang57/X-Decoder
Official Implementation of X-Decoder for generalized decoding for pixel, image and language
Language: Python 0 0