prefixRAINSTARsuffix's Stars
facebookresearch/detectron2
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
microsoft/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
HqWu-HITCS/Awesome-Chinese-LLM
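A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and are cheaper to train, covering base models, vertical-domain fine-tuning and applications, datasets, and tutorials.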
CompVis/latent-diffusion
High-Resolution Image Synthesis with Latent Diffusion Models
facebookresearch/dino
PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO
CompVis/taming-transformers
Taming Transformers for High-Resolution Image Synthesis
mlfoundations/open_flamingo
An open-source framework for training large multimodal models.
NExT-GPT/NExT-GPT
Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model
mit-han-lab/llm-awq
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
IDEA-Research/DINO
[ICLR 2023] Official implementation of the paper "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"
IST-DASLab/gptq
Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
Yuliang-Liu/Monkey
[CVPR 2024 Highlight] Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models
ytongbai/LVM
microsoft/i-Code
eric-ai-lab/MiniGPT-5
Official implementation of paper "MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens"
Yuliang-Liu/MultimodalOCR
On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench)
wenhuchen/Table-Fact-Checking
Data and code for the ICLR 2020 paper "TabFact: A Large-scale Dataset for Table-based Fact Verification"
HaozheZhao/MIC
MMICL, a state-of-the-art VLM with in-context learning ability, from ICL, PKU
HCIILAB/Scene-Text-Recognition-Recommendations
Papers, datasets, algorithms, and SOTA results for STR. Maintained long-term.
SHI-Labs/Rethinking-Text-Segmentation
[CVPR 2021] Rethinking Text Segmentation: A Novel Dataset and A Text-Specific Refinement Approach
tingxueronghua/ChartLlama-code
ZhangYuanhan-AI/visual_prompt_retrieval
[NeurIPS 2023] Official implementation and model release of the paper "What Makes Good Examples for Visual In-Context Learning?"
LukeForeverYoung/UReader
IST-DASLab/OBC
Code for the NeurIPS 2022 paper "Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning".
lfy79001/TableQAKit
A Toolkit for Table-based Question Answering
Mountchicken/Text-Recognition-on-Cross-Domain-Datasets
Improved text recognition algorithms across different text domains such as scene text, handwriting, documents, Chinese/English, and even ancient books
MAEHCM/ICL-D3IE
Code for the ICCV 2023 paper "ICL-D3IE: In-Context Learning with Diverse Demonstrations Updating for Document Information Extraction"
simplify23/MRN
Official PyTorch implementation of MRN: Multiplexed Routing Network for Incremental Multilingual Text Recognition (ICCV 2023)
yale-nlp/DocMath-Eval
zyuh/BDR-main