feymanpriv's Stars
PlexPt/awesome-chatgpt-prompts-zh
A Chinese-language guide to prompting ChatGPT. Usage guides for a variety of scenarios. Learn how to make it do what you ask.
facebookresearch/segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
chenfei-wu/TaskMatrix
Vision-CAIR/MiniGPT-4
Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
microsoft/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
amusi/CVPR2024-Papers-with-Code
A collection of CVPR 2024 papers and open-source projects
IDEA-Research/Grounded-Segment-Anything
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
togethercomputer/OpenChatKit
salesforce/BLIP
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
futantan/OpenGpt
Create your own ChatGPT App in seconds.
mlfoundations/open_flamingo
An open-source framework for training large multimodal models.
Luodian/Otter
🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
baaivision/EVA
EVA Series: Visual Representation Fantasies from BAAI
visual-openllm/visual-openllm
Something like visual-chatgpt; an open-source version of ERNIE Bot (文心一言)
lucidrains/flamingo-pytorch
Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net out of DeepMind, in PyTorch
unit-mesh/unit-minions
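"Boosting AI R&D Efficiency: Train Your Own LoRA". Covers LoRA training for the Llama (Alpaca LoRA) and ChatGLM (ChatGLM Tuning) models. Training tasks: user story generation, test code generation, code-assist generation, text-to-SQL, text-to-code…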
tianrun-chen/SAM-Adapter-PyTorch
Adapting Meta AI's Segment Anything to Downstream Tasks with Adapters and Prompts
showlab/Image2Paragraph
[A toolbox for fun.] Transform Image into Unique Paragraph with ChatGPT, BLIP2, OFA, GRIT, Segment Anything, ControlNet.
Confusezius/Deep-Metric-Learning-Baselines
PyTorch Implementation for Deep Metric Learning Pipelines
showlab/VLog
Transform Video as a Document with ChatGPT, CLIP, BLIP2, GRIT, Whisper, LangChain.
OpenGVLab/UniFormerV2
[ICCV2023] UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer
whwu95/Cap4Video
[CVPR'2023 Highlight & TPAMI] Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
deepglint/unicom
[ICLR 2023] Unicom: Universal and Compact Representation Learning for Image Retrieval
RupertLuo/Valley
The official repository of "Video assistant towards large language model makes everything easy"
xyzforever/BEVT
PyTorch implementation of BEVT (CVPR 2022) https://arxiv.org/abs/2112.01529
whwu95/BIKE
[CVPR'2023] Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models
lucidrains/MaMMUT-pytorch
Implementation of MaMMUT, a simple vision-encoder text-decoder architecture for multimodal tasks from Google, in PyTorch
daniel-code/TubeViT
An unofficial implementation of TubeViT in "Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning"
satojkovic/DeepLogo2
A brand logo detection system by DETR