jin-s13's Stars
voxel51/fiftyone
Refine high-quality datasets and visual AI models
QwenLM/Qwen2.5
Qwen2.5 is the large language model series developed by the Qwen team, Alibaba Cloud.
CVHub520/X-AnyLabeling
Effortless data labeling with AI support from Segment Anything and other awesome models.
chongzhou96/EdgeSAM
Official PyTorch implementation of "EdgeSAM: Prompt-In-the-Loop Distillation for On-Device Deployment of SAM"
wkentaro/labelme
Image Polygonal Annotation with Python (polygon, rectangle, circle, line, point and image-level flag annotation).
siyuanliii/masa
Official Implementation of CVPR24 highlight paper: Matching Anything by Segmenting Anything
InternLM/MindSearch
🔍 An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT)
facebookresearch/sam2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
InternLM/lagent
A lightweight framework for building LLM-based agents
facebookresearch/MobileLLM
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024.
LazyAGI/LazyLLM
Easiest and laziest way to build multi-agent LLM applications.
jin-s13/UniFS
comfyanonymous/ComfyUI
The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
infiniflow/ragflow
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
jin-s13/GKGNet
ECCV'2024 "GKGNet: Group K-Nearest Neighbor based Graph Convolutional Network for Multi-Label Image Recognition"
lishuhuai527/COCO-UniHuman
facebookresearch/eft
Visualization code for 3D human body annotation by EFT (Exemplar Fine-tuning)
BubblyYi/MMPedestron
[ECCV2024] Official implementation of the paper "When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset"
jin-s13/MMPD-Dataset
MMPD Dataset from ECCV'2024 "When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset"
kennethwdk/LocLLM
Code for "LocLLM: Exploiting Generalizable Human Keypoint Localization via Large Language Model", CVPR 2024 Highlight
Ber666/ToolkenGPT
ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings - NeurIPS 2023 (oral)
langgenius/dify
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
IDEA-Research/X-Pose
[ECCV 2024] Official implementation of the paper "X-Pose: Detecting Any Keypoints"
meta-llama/llama3
The official Meta Llama 3 GitHub site
InternLM/xtuner
An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
jy0205/LaVIT
LaVIT: Empower the Large Language Model to Understand and Generate Visual Content
EvolvingLMMs-Lab/lmms-eval
Accelerating the development of large multimodal models (LMMs) with one-click evaluation module - lmms-eval.
open-compass/VLMEvalKit
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting 160+ VLMs and 50+ benchmarks
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
anishmadan23/foundational_fsod
This repository contains the implementation for the paper "Revisiting Few Shot Object Detection with Vision-Language Models"