Fly-hub

Fly-hub's Stars

OpenGVLab/InternVL-MMDetSeg
Train InternViT-6B in MMSegmentation and MMDetection with DeepSpeed
Language:Jupyter Notebook736
Fridayfairy/MiniCPM-V-2_6-OD
Finetuning MiniCPM-V-2_6 for Object Detection Task
Language:Python83
ollama/ollama
Get up and running with Llama 3.3, Phi 4, Gemma 2, and other large language models.
Language: Go · Stars: 108k · Forks: 8.6k
QwenLM/Qwen-VL
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
Language: Python · Stars: 5.3k · Forks: 404
Stability-AI/generative-models
Generative Models by Stability AI
Language: Python · Stars: 25.1k · Forks: 2.8k
xx025/stable-video-diffusion-webui
stable-video-diffusion-webui: image-to-video generation
Language:Python22135
JimmyLv/BibiGPT-v1
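BibiGPT v1 · One-Click AI Summary for Audio/Video & Chat with Learning Content: Bilibili | YouTube | Tweet | TikTok | Dropbox | Google Drive | Local files | Websites | Podcasts | Meetings | Lectures, etc. One-click AI summary & chat for audio/video content: Bilibili | YouTube | Twitter | Xiaohongshu | Douyin | Kuaishou | Baidu Netdisk | Aliyun Drive | web pages | podcasts | meetings | local files, etc. (formerly BiliGPT, the "time-saver" & "AI class representative")
Language: TypeScript · Stars: 5.4k · Forks: 713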
open-mmlab/mmdetection
OpenMMLab Detection Toolbox and Benchmark
Language: Python · Stars: 30k · Forks: 9.5k
Sense-X/Co-DETR
[ICCV 2023] DETRs with Collaborative Hybrid Assignments Training
Language: Python · Stars: 1.1k · Forks: 126
huggingface/pytorch-image-models
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
Language: Python · Stars: 32.9k · Forks: 4.8k
open-mmlab/playground
A central hub for gathering and showcasing amazing projects that extend OpenMMLab with SAM and other exciting features.
Language: Python · Stars: 1.2k · Forks: 125
IDEA-Research/Grounded-Segment-Anything
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
Language: Jupyter Notebook · Stars: 15.6k · Forks: 1.4k
Li-Qingyun/sam-mmrotate
SAM (Segment Anything Model) for generating rotated bounding boxes with MMRotate, which is a comparison method of H2RBox-v2.
Language:Python18214
liuyanyi/sam-with-mmdet
A simple demo for SAM+MMDetection
Language:Python474
lucidrains/denoising-diffusion-pytorch
Implementation of Denoising Diffusion Probabilistic Model in PyTorch
Language: Python · Stars: 8.7k · Forks: 1.1k
AUTOMATIC1111/stable-diffusion-webui
Stable Diffusion web UI
Language: Python · Stars: 146k · Forks: 27.4k
CompVis/stable-diffusion
A latent text-to-image diffusion model
Language: Jupyter Notebook · Stars: 69.1k · Forks: 10.3k
VITA-Group/DeblurGANv2
[ICCV 2019] "DeblurGAN-v2: Deblurring (Orders-of-Magnitude) Faster and Better" by Orest Kupyn, Tetiana Martyniuk, Junru Wu, Zhangyang Wang
Language: Python · Stars: 1.1k · Forks: 267
ljzycmd/SimDeblur
Simple framework for image and video deblurring, implemented in PyTorch
Language:Python31439
subeeshvasu/Awesome-Deblurring
A curated list of resources for Image and Video Deblurring
Stars: 2.5k · Forks: 372
zzh-tech/BiT
[CVPR2023] Blur Interpolation Transformer for Real-World Motion from Blur
Language:Python2267
X-PLUG/mPLUG-Owl
mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
Language: Python · Stars: 2.4k · Forks: 177
RiseInRose/MiniGPT-4-ZH
MiniGPT-4 Chinese deployment guide: translated, with the deployment details filled in
Language:Python859101
Vision-CAIR/MiniGPT-4
Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
Language: Python · Stars: 25.5k · Forks: 2.9k
huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Language: Python · Stars: 138k · Forks: 27.6k
amazon-science/mm-cot
Official implementation for "Multimodal Chain-of-Thought Reasoning in Language Models" (stay tuned and more will be updated)
Language: Python · Stars: 3.9k · Forks: 317
dmlc/decord
An efficient video loader for deep learning with smart shuffling that's super easy to digest
Language: C++ · Stars: 2k · Forks: 167
facebookresearch/fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Language: Python · Stars: 30.8k · Forks: 6.4k
ArrowLuo/CLIP4Clip
An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
Language:Python904125
AndreyGuzhov/AudioCLIP
Source code for models described in the paper "AudioCLIP: Extending CLIP to Image, Text and Audio" (https://arxiv.org/abs/2106.13043)
Language:Python78596