FishAndWasabi's Stars
jbwang1997/OPUS
[NeurIPS 2024] OPUS: Occupancy Prediction Using a Sparse Set
jingyaogong/minimind
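[LLM] Train a small 26M-parameter GPT entirely from scratch in 3 hours; a GPU with as little as 2 GB of memory is enough for inference and training!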
Becomebright/GroundVQA
Official PyTorch code of "Grounded Question-Answering in Long Egocentric Videos", accepted by CVPR 2024.
hzwer/WritingAIPaper
Writing AI Conference Papers: A Handbook for Beginners
hassony2/useful-computer-vision-phd-resources
Lists of resources useful for my PhD in computer vision
fiveai/MoCaE
The official implementation of "MoCaE: Mixture of Calibrated Experts Significantly Improves Accuracy in Object Detection"
yongliu20/SCAN
[CVPR 2024] The repository contains the official implementation of "Open-Vocabulary Segmentation with Semantic-Assisted Calibration"
yzslab/gaussian-splatting-lightning
A 3D Gaussian Splatting framework with various derived algorithms and an interactive web viewer
caojiaolong/RGBDBenchmark
This repository contains various RGBD models and aims to provide a benchmark for evaluating their FLOPs, MACs, and parameter counts. We will continue to add more functionality in the future.
pytorch/torchtune
A Native-PyTorch Library for LLM Fine-tuning
QwenLM/Qwen2-VL
Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
mc-lan/ProxyCLIP
[ECCV 2024] ProxyCLIP: Proxy Attention Improves CLIP for Open-Vocabulary Segmentation
yixuan730/DetToolChain
DetToolChain: A new prompting paradigm to unleash the detection ability of MLLMs
baaivision/EVA
EVA Series: Visual Representation Fantasies from BAAI
mims-harvard/UniTS
A unified multi-task time series model.
baaivision/DIVA
Diffusion Feedback Helps CLIP See Better
LisaAnne/Hallucination
facebookresearch/sam2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
NK-JittorCV/nk-diffusion
zhengyuan-xie/ECCV24_NeST
EleutherAI/gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
Atten4Vis/LW-DETR
This repository is an official implementation of the paper "LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection".
mc-lan/ClearCLIP
[ECCV 2024] ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference
nku-zhichengzhang/ExtDM
[CVPR 2024] This is the official implementation of "ExtDM: Distribution Extrapolation Diffusion Model for Video Prediction"
hhaAndroid/llama3
The official Meta Llama 3 GitHub site
facebookresearch/DiT
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
yang-0201/MAF-YOLO
Implementation of paper - Multi-Branch Auxiliary Fusion YOLO with Re-parameterization Heterogeneous Convolutional for accurate object detection.
jbwang1997/StabilityIndex
[ECCV 2024] Towards Stable 3D Object Detection
Luo-Z13/pointobb
[CVPR 2024] PointOBB: Learning Oriented Object Detection via Single Point Supervision
TencentARC/Open-MAGVIT2
Open-MAGVIT2: Democratizing Autoregressive Visual Generation