Pinned Repositories
awesome-detection-transformer
A collection of papers on transformers for detection and segmentation. Awesome Detection Transformer for Computer Vision (CV)
awesome-multiple-object-tracking
Resources for Multiple Object Tracking (MOT)
awesome-open-vocabulary-object-detection
Awesome-Token-Compress
A paper list of recent works on token compression for ViT and VLM
cambrian
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
CDN
Code for "Mining the Benefits of Two-stage and One-stage HOI Detection"
deeplearning_ai_books
deeplearning.ai (Andrew Ng's deep learning course notes and resources)
NJU-Big-Data
Course Repo for Big Data Processing: Comprehensive Experiments
The-Phoenix-Project
The Phoenix Project: a legendary story of IT operations
p-MoD
p-MoD: Building Mixture-of-Depths MLLMs via Progressive Ratio Decay
jungle-gym-ac's Repositories
jungle-gym-ac/NJU-Big-Data
Course Repo for Big Data Processing: Comprehensive Experiments
jungle-gym-ac/awesome-detection-transformer
A collection of papers on transformers for detection and segmentation. Awesome Detection Transformer for Computer Vision (CV)
jungle-gym-ac/awesome-multiple-object-tracking
Resources for Multiple Object Tracking (MOT)
jungle-gym-ac/Awesome-Token-Compress
A paper list of recent works on token compression for ViT and VLM
jungle-gym-ac/cambrian
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
jungle-gym-ac/CDN
Code for "Mining the Benefits of Two-stage and One-stage HOI Detection"
jungle-gym-ac/ChatGPT-Next-Web
A cross-platform ChatGPT/Gemini UI (Web / PWA / Linux / Win / MacOS). Get your own cross-platform ChatGPT/Gemini app in one click.
jungle-gym-ac/detr
End-to-End Object Detection with Transformers
jungle-gym-ac/NJUCS-Courses
Course Materials from NJUCS
jungle-gym-ac/chameleon
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
jungle-gym-ac/copilot-gpt4-service
Convert GitHub Copilot to ChatGPT, with free use of the GPT-4 model
jungle-gym-ac/DeepStack-VL
jungle-gym-ac/FastV
Code for paper: An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models
jungle-gym-ac/FlexAttention
jungle-gym-ac/HiRED
An early token dropping algorithm to improve inference efficiency for Vision-Language Models with high-resolution images under resource constraints.
jungle-gym-ac/HOI-Learning-List
A list of Human-Object Interaction Learning.
jungle-gym-ac/HOI-Transformer
HOI Detection Transformer Architecture, Based on CVPR2021 paper "QPIC: Query-Based Pairwise Human-Object Interaction Detection with Image-Wide Contextual Information"
jungle-gym-ac/InternVideo
Video Foundation Models & Data for Multimodal Understanding
jungle-gym-ac/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.
jungle-gym-ac/Linux-Config
My Linux Configuration Scripts, Oh-My-Zsh, etc.
jungle-gym-ac/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
jungle-gym-ac/LLaVA-NeXT
jungle-gym-ac/lmms-eval
Accelerating the development of large multimodal models (LMMs) with lmms-eval
jungle-gym-ac/mPLUG-Owl
mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
jungle-gym-ac/NJU-DisSys-Go-RPC
RPC Distributed System implemented in Go
jungle-gym-ac/Open-LLaVA-NeXT
An open-source implementation of LLaVA-NeXT.
jungle-gym-ac/p-MoD
p-MoD: Building Mixture-of-Depths MLLMs via Progressive Ratio Decay
jungle-gym-ac/vstar
PyTorch Implementation of "V* : Guided Visual Search as a Core Mechanism in Multimodal LLMs"
jungle-gym-ac/webvid
Large-scale text-video dataset. 10 million captioned short videos.
jungle-gym-ac/zotero-bridge
Obsidian plugin to integrate with Zotero through ZotServer