Ki-Zhang's Stars
openai/CLIP
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
lucidrains/vit-pytorch
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
infiniflow/ragflow
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
IDEA-Research/Grounded-Segment-Anything
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
labelmeai/labelme
Image Polygonal Annotation with Python (polygon, rectangle, circle, line, point and image-level flag annotation).
HumanAIGC/EMO
Emote Portrait Alive: Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
mamba-org/mamba
The Fast Cross-Platform Package Manager
PyQt5/PyQt
PyQt examples (assorted PyQt tests and examples) for PyQt4 and PyQt5
pengsida/learning_research
My personal research experience
victoresque/pytorch-template
PyTorch deep learning projects made easy.
dvlab-research/MGM
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
bowang-lab/MedSAM
Segment Anything in Medical Images
IDEA-Research/DINO
[ICLR 2023] Official implementation of the paper "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"
yformer/EfficientSAM
EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything
botuniverse/onebot
OneBot: a unified application interface standard for chatbots
FoundationVision/GLEE
[CVPR2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale
InsightSoftwareConsortium/SimpleITK-Notebooks
Jupyter notebooks for learning how to use SimpleITK
bowang-lab/U-Mamba
U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation
radarFudan/Awesome-state-space-models
Collection of papers on state-space models
Curt-Park/segment-anything-with-clip
Segment Anything combined with CLIP
kyegomez/NaViT
My implementation of "Patch n’ Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution"
sail-sg/ptp
[CVPR2023] The code for "Position-guided Text Prompt for Vision-Language Pre-training"
nowsyn/InstMatt
Official repository for Instance Human Matting via Mutual Guidance and Multi-Instance Refinement
nobodyplayer1/VM-UNetV2
wenyalintw/Dicom-Viewer
An application displaying 2D/3D Dicom
wenzhengzeng/MPEblink
[CVPR 2023] Real-time Multi-person Eyeblink Detection in the Wild for Untrimmed Video
hustvl/ViTGaze
yuechuanlin-cw/PyOCT
Image reconstruction and data processing for spectral-domain optical coherence tomography
TomographicImaging/iDVC
Digital Volume Correlation user interface
Baron-sanmen/CrossGLG
The code for "CrossGLG: LLM Guides One-shot Skeleton-based 3D Action Recognition in a Cross-level Manner"