zhouyao4321

zhouyao4321's Stars

AUTOMATIC1111/stable-diffusion-webui
Stable Diffusion web UI
Language:Python141k 1.1k 7.6k26.6k
meta-llama/llama
Inference code for Llama models
Language:Python55.9k 522 9629.5k
facebookresearch/segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Language:Jupyter Notebook47k 305 6625.6k
huggingface/pytorch-image-models
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
Language:Python31.7k 311 9144.7k
facebookresearch/fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Language:Python30.3k 428 4.2k6.4k
facebookresearch/detectron2
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
Language:Python30.2k 386 3.5k7.4k
hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
Language:Python21.8k 184 4902.1k
google-deepmind/deepmind-research
This repository contains implementations and illustrative code to accompany DeepMind publications
Language:Jupyter Notebook13.1k 325 3212.6k
Megvii-BaseDetection/YOLOX
YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO supported. Documentation: https://yolox.readthedocs.io/
Language:Python9.4k 76 1.5k2.2k
facebookresearch/mae
PyTorch implementation of MAE https://arxiv.org/abs/2111.06377
Language:Python7.2k 56 1911.2k
rom1504/img2dataset
Easily turn large sets of image URLs into an image dataset. Can download, resize, and package 100M URLs in 20h on one machine.
Language:Python3.6k 31 255336
fundamentalvision/BEVFormer
[ECCV 2022] This is the official implementation of BEVFormer, a camera-only framework for autonomous driving perception, e.g., 3D object detection and semantic map segmentation.
Language:Python3.3k 69 263531
traveller59/spconv
Spatial Sparse Convolution Library
Language:Python1.9k 24 691362
SHI-Labs/Neighborhood-Attention-Transformer
Neighborhood Attention Transformer, arxiv 2022 / CVPR 2023. Dilated Neighborhood Attention Transformer, arxiv 2022
Language:Python1k 16 7785
gnobitab/RectifiedFlow
Official Implementation of Rectified Flow (ICLR2023 Spotlight)
Language:Python852 11 2252
chenhsuanlin/bundle-adjusting-NeRF
BARF: Bundle-Adjusting Neural Radiance Fields 🤮 (ICCV 2021 oral)
Language:Python786 12 83113
IDEA-Research/DN-DETR
[CVPR 2022 Oral] Official implementation of DN-DETR
Language:Python541 16 6762
ranandalon/mtl
Unofficial implementation of: Multi-task learning using uncertainty to weigh losses for scene geometry and semantics
Language:Python539 7 1578
bradyz/cross_view_transformers
Cross-view Transformers for real-time Map-view Semantic Segmentation (CVPR 2022 Oral)
Language:Python525 14 6180
Megvii-BaseDetection/DeFCN
End-to-End Object Detection with Fully Convolutional Network
Language:Python494 23 2336
TRI-ML/dd3d
Official PyTorch implementation of DD3D: Is Pseudo-Lidar needed for Monocular 3D Object detection? (ICCV 2021), Dennis Park*, Rares Ambrus*, Vitor Guizilini, Jie Li, and Adrien Gaidon.
Language:Python464 22 4874
Owen-Liuyuxuan/visualDet3D
Official Repo for Ground-aware Monocular 3D Object Detection for Autonomous Driving / YOLOStereo3D: A Step Back to 2D for Efficient Stereo 3D Detection
Language:Python365 7 8577
DrSleep/multi-task-refinenet
Multi-Task (Joint Segmentation / Depth / Surface Normals) Real-Time Light-Weight RefineNet
Language:Jupyter Notebook200 9 1145
TRI-ML/PF-Track
Implementation of PF-Track
Language:Python199 9 3226
kakaobrain/sparse-detr
PyTorch Implementation of Sparse DETR
Language:Python161 13 1115
kienduynguyen/BoxeR
Code release for "BoxeR: Box-Attention for 2D and 3D Transformers"
Language:Python138 4 2422
SuperMHP/GUPNet
Language:Python130 6 4722
Gorilla-Lab-SCUT/VISTA
This repo presents you the official code of "VISTA: Boosting 3D Object Detection via Dual Cross-VIew SpaTial Attention"
Language:Python127 2 2213
lucidrains/uniformer-pytorch
Implementation of Uniformer, a simple attention and 3d convolutional net that achieved SOTA in a number of video classification tasks, debuted in ICLR 2022
Language:Python97 6 34
SJSU-AD/FusionAD
An open source autonomous driving stack by San Jose State University Autonomous Driving Team
Language:C++42 9 14124