Aph-xin

Aph-xin's Stars

d2l-ai/d2l-zh
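Dive into Deep Learning (《动手学深度学习》): written for Chinese readers, runnable, and open to discussion. The Chinese and English editions are used for teaching at more than 500 universities in over 70 countries.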
Language: Python 65.1k 1.1k 0 11.2k
infiniflow/ragflow
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
Language: Python 28.4k 153 2.1k 2.7k
qdrant/qdrant
Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
Language: Rust 21.4k 129 1.4k 1.5k
IDEA-Research/GroundingDINO
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
Language: Python 7.2k 45 311729
open-mmlab/mmcv
OpenMMLab Computer Vision Foundation
Language: Python 6k 85 1.2k 1.7k
lancedb/lancedb
Developer-friendly, serverless vector database for AI applications. Easily add long-term memory to your LLM apps!
Language: Python 5.3k 33 819365
salesforce/BLIP
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Language: Jupyter Notebook 5k 33 202662
Lordog/dive-into-llms
Dive into LLMs (《动手学大模型》): a series of hands-on programming tutorials.
4.2k 25 9364
google-research/scenic
Scenic: A Jax Library for Computer Vision Research and Beyond
Language: Python 3.4k 39 269440
mcordts/cityscapesScripts
README and scripts for the Cityscapes Dataset
Language: Python 2.2k 44 140606
milvus-io/bootcamp
Dealing with all unstructured data, such as reverse image search, audio search, molecular search, video analysis, question and answer systems, NLP, etc.
Language: HTML 2k 35 264602
FoundationVision/GLEE
[CVPR 2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale
Language: Python 1.1k 47 4986
ArrowLuo/CLIP4Clip
An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
Language: Python 904 13 110125
jianzongwu/Awesome-Open-Vocabulary
(TPAMI 2024) A Survey on Open Vocabulary Learning
875 26 1450
IDEA-Research/Grounding-DINO-1.5-API
Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series
Language: Python 854 14 4829
MC-E/DragonDiffusion
ICLR 2024 (Spotlight)
Language: Python 740 41 2921
limuloo/MIGC
[CVPR 2024 Highlight] MIGC and [TPAMI 2024] MIGC++ (Official Implementation)
Language: Python 572 22 1528
frank-xwang/InstanceDiffusion
[CVPR 2024] Code release for "InstanceDiffusion: Instance-level Control for Image Generation"
Language: Python 535 8 4229
Jingkang50/OpenPSG
Benchmarking Panoptic Scene Graph Generation (PSG), ECCV'22
Language: Python 433 6 9569
LukasBommes/mv-extractor
Extract frames and motion vectors from H.264 and MPEG-4 encoded video.
Language: C 315 5 4062
WongSaang/chatgpt-ui-server
A ChatGPT UI server based on the Django framework.
Language: Python 308 7 25168
NVIDIA-AI-IOT/nanoowl
A project that optimizes OWL-ViT for real-time inference with NVIDIA TensorRT.
Language: Python 289 6 3050
wjun0830/QD-DETR
Official PyTorch repository for "QD-DETR: Query-Dependent Video Representation for Moment Retrieval and Highlight Detection" (CVPR 2023 Paper)
Language: Python 218 4 4715
tsunghan-wu/SLD
🔥 [CVPR 2024] Official implementation of "Self-correcting LLM-controlled Diffusion Models (SLD)"
Language: Python 162 3 5 8
favyen/miris
MIRIS: Fast Object Track Queries in Video
Language: Go 17 1 7 8
zkx06111/ReDiffusion
Language: Python 14 1 5 0
everest-project/everest
Top-K Deep Video Analytics: A Probabilistic Approach
Language: Python 12 4 0 8
orm011/seesaw
(Research) Interactive retrieval system and algorithms: find objects of interest within image databases with less human effort
Language: Jupyter Notebook 6 4 0 2
uwdb/EQUI-VOCAL
EQUI-VOCAL: Synthesizing Queries for Compositional Video Events from Limited User Interactions
Language: Jupyter Notebook 6 3 0 1
autodistill/autodistill-owlv2
OWLv2 base model for use with Autodistill.
Language: Python 5 4 2 6