1349949's Stars
openai/evals
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
mlfoundations/open_clip
An open source implementation of CLIP.
facebookresearch/DiT
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
facebookresearch/Mask2Former
Code release for "Masked-attention Mask Transformer for Universal Image Segmentation"
OpenGVLab/InternImage
[CVPR 2023 Highlight] InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
mit-han-lab/bevfusion
[ICRA'23] BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation
baaivision/EVA
EVA Series: Visual Representation Fantasies from BAAI
IDEA-Research/DINO
[ICLR 2023] Official implementation of the paper "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"
ShoufaChen/DiffusionDet
[ICCV2023 Best Paper Finalist] PyTorch implementation of DiffusionDet (https://arxiv.org/abs/2211.09788)
MasterBin-IIAU/UNINEXT
[CVPR'23] Universal Instance Perception as Object Discovery and Retrieval
HuangJunJie2017/BEVDet
Code base of the BEVDet series.
MCG-NJU/VideoMAE
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
chaytonmin/Awesome-BEV-Perception-Multi-Cameras
Awesome papers about Multi-Camera 3D Object Detection and Segmentation in Bird's-Eye-View, such as DETR3D, BEVDet, BEVFormer, BEVDepth, UniAD
megvii-research/PETR
[ECCV2022] PETR: Position Embedding Transformation for Multi-View 3D Object Detection & [ICCV2023] PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images
Megvii-BaseDetection/BEVDepth
Official code for BEVDepth.
aharley/simple_bev
A Simple Baseline for BEV Perception
TRI-ML/DDAD
Dense Depth for Autonomous Driving (DDAD) dataset.
facebookresearch/msn
Masked Siamese Networks for Label-Efficient Learning (https://arxiv.org/abs/2204.07141)
autonomousvision/kitti360Scripts
This repository contains utility scripts for the KITTI-360 dataset.
hancyran/RepSurf
[CVPR 2022 Oral] Official implementation for "Surface Representation for Point Clouds"
TuSimple/centerformer
Implementation for CenterFormer: Center-based Transformer for 3D Object Detection (ECCV 2022)
Megvii-BaseDetection/BEVStereo
Official code for BEVStereo
Divadi/SOLOFusion
Time Will Tell: New Outlooks and A Baseline for Temporal Multi-View 3D Object Detection
lucidrains/glom-pytorch
An attempt at the implementation of Glom, Geoffrey Hinton's new idea that integrates concepts from neural fields, top-down-bottom-up processing, and attention (consensus between columns), for emergent part-whole hierarchies from data
AutoVision-cloud/SSL-Lanes
[CoRL-2022] SSL-Lanes: Self-Supervised Learning for Motion Forecasting in Autonomous Driving
yichen928/STRL
Code for the paper "Spatio-temporal Self-Supervised Representation Learning for 3D Point Clouds" (ICCV 2021)
Sense-X/AGVM
Large-batch Optimization for Dense Visual Predictions (NeurIPS 2022)
d1024choi/HLSTrajForecast
The official implementation of "Hierarchical Latent Structure for Multi-Modal Vehicle Trajectory Forecasting", presented at ECCV 2022.
HFAiLab/hdmapnet
lzhbrian/Awesome-BEV-Papers