Pinned Repositories
1xN
1xN Block Pattern for Network Sparsity
AAL-pruning
Filter Pruning for Deep Convolutional Neural Networks via Auxiliary Attention
DW
A Dual Weighting Label Assignment Scheme for Object Detection
DyRep
Official implementation for paper "DyRep: Bootstrapping Training with Dynamic Re-parameterization", CVPR 2022
GASN
A Novel Guided Anchor Siamese Network for Arbitrary Target-Of-Interest Tracking in Video-SAR
LPNet-PyTorch
This repository is a PyTorch version of the paper "Luminance-aware Pyramid Network for Low-light Image Enhancement" (TMM 2020).
ResamplingNet
ResamplingNet: End-to-End Adaptive Feature Resampling Network for Real-Time Aerial Tracking
Restoring-Extremely-Dark-Images-In-Real-Time
The project is the official implementation of our CVPR 2021 paper, "Restoring Extremely Dark Images in Real Time"
StreamYOLO
Real-time Object Detection for Streaming Perception, CVPR 2022
Ultra-Fast-Lane-Detection-v2-plus
Based on ufld-v2
scott-mao's Repositories
scott-mao/MobiLlama
MobiLlama: Small Language Model tailored for edge devices
scott-mao/A2J-Transformer
Code for paper 'A2J-Transformer: Anchor-to-Joint Transformer Network for 3D Interacting Hand Pose Estimation from a Single RGB Image', CVPR 2023
scott-mao/Arbitrary-Hands-3D-Reconstruction
🔥(CVPR 2023) ACR: Attention Collaboration-based Regressor for Arbitrary Two-Hand Reconstruction
scott-mao/awesome-hand-pose-estimation
Awesome work on hand pose estimation/tracking
scott-mao/DIR
[ICCV 2023 Oral] Decoupled Iterative Refinement Framework for Interacting Hands Reconstruction from a Single RGB Image
scott-mao/EasyCV
An all-in-one toolkit for computer vision
scott-mao/EvLowLight
Coherent Event Guided Low-Light Video Enhancement
scott-mao/GeoChat
[CVPR 2024 🔥] GeoChat, the first grounded Large Vision Language Model for Remote Sensing
scott-mao/GPA-LM
This repo is a live list of papers on game playing and large multimodal models - "A Survey on Game Playing Agents and Large Models: Methods, Applications, and Challenges".
scott-mao/GPT-Eye
This repository is the official implementation of the paper "GPT-Eye: A Large Vision-Language Foundation Model for Ophthalmology with Clinical Knowledge"
scott-mao/hagrid
HAnd Gesture Recognition Image Dataset
scott-mao/HammerLLM
1.4B sLLM for Chinese and English - HammerLLM🔨
scott-mao/InfoBatch
Lossless Training Speed Up by Unbiased Dynamic Data Pruning
scott-mao/instruct-pix2pix
scott-mao/Light-Weight-Trackers
This is the official repository for the ICASSP 2023 paper "On Designing Light-Weight Object Trackers Through Network Pruning: Use CNNs or Transformers?"
scott-mao/LLaVA-Hound-DPO
scott-mao/LRD
Official implementation for our ICCV 2023 paper “Towards General Low-Light Raw Noise Synthesis and Modeling”
scott-mao/LVLM-LP
The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models?
scott-mao/MADTP
MADTP: Multimodal Alignment-Guided Dynamic Token Pruning for Accelerating Vision-Language Transformer
scott-mao/only_train_once
[ICLR 2023] OTOv2: Automatic, Generic, User-Friendly; [NeurIPS 2021] Only Train Once: A One-Shot Neural Network Training And Pruning Framework
scott-mao/PanoHead
Code Repository for CVPR 2023 Paper "PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360 degree"
scott-mao/QA-ViT
scott-mao/RCD
scott-mao/RenderIH
Official PyTorch implementation of "RenderIH: A large-scale synthetic dataset for 3D interacting hand pose estimation", ICCV 2023
scott-mao/SILI_CD
Official PyTorch Implementation of “Continuous Cross-resolution Remote Sensing Image Change Detection”
scott-mao/SPIN
Repository for the paper "Learning to Reconstruct 3D Human Pose and Shape via Model-fitting in the Loop"
scott-mao/tinyllama-zh
A side project that pretrains a tinyllama on Chinese corpora, with minimal modifications to the Hugging Face transformers code.
scott-mao/TinyLLaVABench
A Framework of Small-scale Large Multimodal Models
scott-mao/UniRGB-IR
Official repo for UniRGB-IR.
scott-mao/Visual-CoT
Visual CoT: Unleashing Chain-of-Thought Reasoning in the Multi-Modal Language Model