futureisatyourhand
I believe that one day I will succeed.
Institute of Computing Technology, Chinese Academy of Sciences, Beijing
futureisatyourhand's Stars
facebookresearch/segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
PaddlePaddle/PaddleOCR
Awesome multilingual OCR toolkits based on PaddlePaddle (a practical, ultra-lightweight OCR system that supports recognition of 80+ languages, provides data annotation and synthesis tools, and supports training and deployment across server, mobile, embedded, and IoT devices)
huggingface/pytorch-image-models
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
openai/gpt-2
Code for the paper "Language Models are Unsupervised Multitask Learners"
karpathy/minGPT
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
microsoft/LoRA
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
jacobgil/pytorch-grad-cam
Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
salesforce/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
autogluon/autogluon
Fast and Accurate ML in 3 Lines of Code
QwenLM/Qwen-VL
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
salesforce/BLIP
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
open-mmlab/mmocr
OpenMMLab Text Detection, Recognition and Understanding Toolbox
InternLM/InternLM-XComposer
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
baaivision/EVA
EVA Series: Visual Representation Fantasies from BAAI
salesforce/ALBEF
Code for ALBEF: a new vision-language pre-training method
facebookresearch/ConvNeXt-V2
Code release for ConvNeXt V2 model
AlibabaResearch/AdvancedLiterateMachinery
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
minivision-ai/Silent-Face-Anti-Spoofing
Silent liveness detection (Silent-Face-Anti-Spoofing)
microsoft/RegionCLIP
[CVPR 2022] Official code for "RegionCLIP: Region-based Language-Image Pretraining"
weijiaheng/Advances-in-Label-Noise-Learning
A curated (most recent) list of resources for Learning with Noisy Labels
chunbolang/BAM
Official PyTorch Implementation of Learning What Not to Segment: A New Perspective on Few-Shot Segmentation (CVPR'22 Oral & TPAMI'23).
Jyouhou/UnrealText
Synthetic Scene Text from 3D Engines
wenwenyu/TCM
Turning a CLIP Model into a Scene Text Detector (CVPR2023) | Turning a CLIP Model into a Scene Text Spotter (TPAMI)
LAION-AI/scaling-laws-openclip
Reproducible scaling laws for contrastive language-image learning (https://arxiv.org/abs/2212.07143)
wangsr126/MAE-Lite
Official implementation of the ICML 2023 paper "A Closer Look at Self-Supervised Lightweight Vision Transformers"
zejiangh/MILAN
PyTorch implementation of the paper "MILAN: Masked Image Pretraining on Language Assisted Representation" https://arxiv.org/pdf/2208.06049.pdf.
bytedance/oclip
futureisatyourhand/atlas300_ascend310_fewshotdetection
futureisatyourhand/Top-Related-Meta-Learning-Method-for-Few-Shot-Detection
Code for the paper at https://arxiv.org/pdf/2007.06837.pdf
futureisatyourhand/self-supervised-learning
Code for self-supervised image classification and object detection