fuzzypants123's Stars
babysor/MockingBird
🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real-time
coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
meta-llama/llama3
The official Meta Llama 3 GitHub site
microsoft/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Delgan/loguru
Python logging made (stupidly) simple
WongKinYiu/yolov9
Implementation of paper - YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information
THUDM/CodeGeeX2
CodeGeeX2: A More Powerful Multilingual Code Generation Model
facebookresearch/dino
PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO
levihsu/OOTDiffusion
Official implementation of OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on
yformer/EfficientSAM
EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything
hustvl/YOLOP
You Only Look Once for Panoptic Driving Perception. (MIR2022)
FoundationVision/GLEE
[CVPR2024 Highlight]GLEE: General Object Foundation Model for Images and Videos at Scale
sangyun884/HR-VITON
Official PyTorch implementation for the paper High-Resolution Virtual Try-On with Misalignment and Occlusion-Handled Conditions (ECCV 2022).
Skallwar/suckit
Suck the InTernet
chongzhou96/EdgeSAM
Official PyTorch implementation of "EdgeSAM: Prompt-In-the-Loop Distillation for On-Device Deployment of SAM"
AnythingInAnyScene/anything_in_anyscene
Executedone/Chinese-FastSpeech2
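Continued training on the Biaobei (标贝) data, with improvements to the original FastSpeech2 model: prosody representation and prosody prediction modules are introduced to make Chinese pronunciation more vivid and rhythmic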
SoccerNet/sn-gamestate
SoccerNet Game State Reconstruction: End-to-End Athlete Tracking and Identification on a Minimap (CVPR24 - CVSports workshop)
microsoft/SceneLandmarkLocalization
Source code and data for papers "Improved Scene Landmark Detection for Camera Localization" (3DV 2024) and "Learning to Detect Scene Landmarks for Camera Localization" (CVPR 2024).
Traffic-X/ViT-CoMer
Official implementation of the CVPR 2024 paper ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions.
sming256/OpenTAD
OpenTAD is an open-source temporal action detection (TAD) toolbox based on PyTorch.
ztsrxh/RoadBEV
Code for RoadBEV: road surface reconstruction in Bird's Eye View
fudan-zvg/RoadNet
[ICCV2023 Oral] RoadNetworkTRansformer & [AAAI 2024] LaneGraph2Seq
MaySummerWind/DocHunt
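🎉 Collects and organizes publicly shared document links from Feishu and similar platforms, addressing the lack of an official global search so that knowledge keeps circulating. A list of cool, beautiful, and interesting Feishu docs.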
mengtan00/SA-BEV
This is the implementation of the paper "SA-BEV: Generating Semantic-Aware Bird's-Eye-View Feature for Multi-view 3D Object Detection" (ICCV 2023)
ChiShengChen/ResVMamba
The official implementation repository of Res-VMamba: Fine-Grained Food Category Visual Classification Using Selective State Space Models with Deep Residual Learning
TUMFTM/GMMCalib
LiDAR-to-LiDAR Calibration
CaiYingFeng/VRSO
vincentqqb/PriorLane
SAIC-Vision/WS-3D-Lane
[ICRA 2023] WS-3D-Lane: Weakly Supervised 3D Lane Detection with 2D Lane Labels