Aph-xin's Stars
d2l-ai/d2l-zh
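Dive into Deep Learning (《动手学深度学习》): written for Chinese readers, runnable, and open to discussion. The Chinese and English editions are used for teaching at more than 500 universities in over 70 countries.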
infiniflow/ragflow
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
qdrant/qdrant
Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
IDEA-Research/GroundingDINO
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
open-mmlab/mmcv
OpenMMLab Computer Vision Foundation
lancedb/lancedb
Developer-friendly, serverless vector database for AI applications. Easily add long-term memory to your LLM apps!
salesforce/BLIP
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Lordog/dive-into-llms
Dive into LLMs (《动手学大模型》): a series of hands-on programming tutorials
google-research/scenic
Scenic: A Jax Library for Computer Vision Research and Beyond
mcordts/cityscapesScripts
README and scripts for the Cityscapes Dataset
milvus-io/bootcamp
Dealing with all unstructured data, such as reverse image search, audio search, molecular search, video analysis, question and answer systems, NLP, etc.
FoundationVision/GLEE
[CVPR 2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale
ArrowLuo/CLIP4Clip
An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
jianzongwu/Awesome-Open-Vocabulary
(TPAMI 2024) A Survey on Open Vocabulary Learning
IDEA-Research/Grounding-DINO-1.5-API
Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series
MC-E/DragonDiffusion
ICLR 2024 (Spotlight)
limuloo/MIGC
[CVPR 2024 Highlight] MIGC and [TPAMI 2024] MIGC++ (Official Implementation)
frank-xwang/InstanceDiffusion
[CVPR 2024] Code release for "InstanceDiffusion: Instance-level Control for Image Generation"
Jingkang50/OpenPSG
Benchmarking Panoptic Scene Graph Generation (PSG), ECCV'22
LukasBommes/mv-extractor
Extract frames and motion vectors from H.264 and MPEG-4 encoded video.
WongSaang/chatgpt-ui-server
A ChatGPT UI server based on the Django framework.
NVIDIA-AI-IOT/nanoowl
A project that optimizes OWL-ViT for real-time inference with NVIDIA TensorRT.
wjun0830/QD-DETR
Official PyTorch repository for "QD-DETR: Query-Dependent Video Representation for Moment Retrieval and Highlight Detection" (CVPR 2023 paper)
tsunghan-wu/SLD
🔥 [CVPR 2024] Official implementation of "Self-correcting LLM-controlled Diffusion Models" (SLD)
favyen/miris
MIRIS: Fast Object Track Queries in Video
zkx06111/ReDiffusion
everest-project/everest
Top-K Deep Video Analytics: A Probabilistic Approach
orm011/seesaw
(Research) interactive retrieval system+algorithms: find objects of interest within image databases with less human effort
uwdb/EQUI-VOCAL
EQUI-VOCAL: Synthesizing Queries for Compositional Video Events from Limited User Interactions
autodistill/autodistill-owlv2
OWLv2 base model for use with Autodistill.