1349949's Stars

openai/evals
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
Language: Python 15.3k 263 2152.6k
mlfoundations/open_clip
An open source implementation of CLIP.
Language: Python 10.7k 81 5051k
facebookresearch/DiT
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
Language: Python 6.6k 44 83593
facebookresearch/Mask2Former
Code release for "Masked-attention Mask Transformer for Universal Image Segmentation"
Language: Python 2.6k 28 238396
OpenGVLab/InternImage
[CVPR 2023 Highlight] InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
Language: Python 2.6k 36 268239
mit-han-lab/bevfusion
[ICRA'23] BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation
Language: Python 2.4k 41 607437
baaivision/EVA
EVA Series: Visual Representation Fantasies from BAAI
Language: Python 2.4k 30 163170
IDEA-Research/DINO
[ICLR 2023] Official implementation of the paper "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"
Language: Python 2.3k 32 266265
ShoufaChen/DiffusionDet
[ICCV2023 Best Paper Finalist] PyTorch implementation of DiffusionDet (https://arxiv.org/abs/2211.09788)
Language: Python 2.1k 17 115163
MasterBin-IIAU/UNINEXT
[CVPR'23] Universal Instance Perception as Object Discovery and Retrieval
Language: Python 1.5k 101 57158
HuangJunJie2017/BEVDet
Code base of the BEVDet series.
Language: Python 1.5k 35 363267
MCG-NJU/VideoMAE
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Language: Python 1.4k 16 125137
chaytonmin/Awesome-BEV-Perception-Multi-Cameras
Awesome papers about Multi-Camera 3D Object Detection and Segmentation in Bird's-Eye-View, such as DETR3D, BEVDet, BEVFormer, BEVDepth, UniAD
1k 52 7109
megvii-research/PETR
[ECCV2022] PETR: Position Embedding Transformation for Multi-View 3D Object Detection & [ICCV2023] PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images
Language: Python 888 15 162134
Megvii-BaseDetection/BEVDepth
Official code for BEVDepth.
Language: Python 741 15 167103
aharley/simple_bev
A Simple Baseline for BEV Perception
Language: Python 514 8 6079
TRI-ML/DDAD
Dense Depth for Autonomous Driving (DDAD) dataset.
Language: Python 498 34 3456
facebookresearch/msn
Masked Siamese Networks for Label-Efficient Learning (https://arxiv.org/abs/2204.07141)
Language: Python 451 12 2433
autonomousvision/kitti360Scripts
This repository contains utility scripts for the KITTI-360 dataset.
Language: Python 397 17 13263
hancyran/RepSurf
[CVPR 2022 Oral] Official implementation for "Surface Representation for Point Clouds"
Language: Python 335 6 3525
TuSimple/centerformer
Implementation for CenterFormer: Center-based Transformer for 3D Object Detection (ECCV 2022)
Language: Python 294 12 3928
Megvii-BaseDetection/BEVStereo
Official code for BEVStereo
Language: Python 261 13 1415
Divadi/SOLOFusion
Time Will Tell: New Outlooks and A Baseline for Temporal Multi-View 3D Object Detection
Language: Python 242 16 2613
lucidrains/glom-pytorch
An attempt at the implementation of Glom, Geoffrey Hinton's new idea that integrates concepts from neural fields, top-down-bottom-up processing, and attention (consensus between columns), for emergent part-whole hierarchies from data
Language: Python 193 15 727
AutoVision-cloud/SSL-Lanes
[CoRL-2022] SSL-Lanes: Self-Supervised Learning for Motion Forecasting in Autonomous Driving
Language: Jupyter Notebook 118 4 615
yichen928/STRL
Code for the paper "Spatio-temporal Self-Supervised Representation Learning for 3D Point Clouds" (ICCV 2021)
Language: Python 74 7 98
Sense-X/AGVM
Large-batch Optimization for Dense Visual Predictions (NeurIPS 2022)
Language: Python 56 3 23
d1024choi/HLSTrajForecast
The official implementation of "Hierarchical Latent Structure for Multi-Modal Vehicle Trajectory Forecasting" presented in ECCV2022.
Language: Python 37 3 36
HFAiLab/hdmapnet
Language: Python 36 3 66
lzhbrian/Awesome-BEV-Papers
20 2 01