aaberdam's Stars
BradyFU/Awesome-Multimodal-Large-Language-Models
✨✨ Latest Advances on Multimodal Large Language Models
voxel51/fiftyone
Refine high-quality datasets and visual AI models
clovaai/donut
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
baudm/parseq
Scene Text Recognition with Permuted Autoregressive Sequence Models (ECCV 2022)
microsoft/GenerativeImage2Text
GIT: A Generative Image-to-text Transformer for Vision and Language
utkuozbulak/pytorch-cnn-adversarial-attacks
PyTorch implementation of convolutional neural network adversarial attack techniques
ku21fan/STR-Fewer-Labels
Scene Text Recognition (STR) methods trained with fewer real labels (CVPR 2021)
shilomagen/passport-extension
amazon-science/glass-text-spotting
Official implementation for "GLASS: Global to Local Attention for Scene-Text Spotting" (ECCV'22)
furkanbiten/idl_data
OCR Annotations from Amazon Textract for Industry Documents Library
amazon-science/semimtr-text-recognition
Multimodal Semi-Supervised Learning for Text Recognition (SemiMTR)
phantrdat/cvpr20-scatter-text-recognizer
Unofficial implementation of CVPR 2020 paper "SCATTER: Selective Context Attentional Scene Text Recognizer"
GaryMataev/DeepRED
DeepRED: Deep Image Prior Powered by RED
jsulam/ml-ista
Demo for Multi-Layer ISTA and Multi-Layer FISTA algorithms for convolutional neural networks, as described in J. Sulam, A. Aberdam, A. Beck, and M. Elad (2018), "On Multi-Layer Basis Pursuit, Efficient Algorithms and Convolutional Neural Networks," arXiv preprint arXiv:1806.00701
amazon-science/textadain-robust-recognition
TextAdaIN: Paying Attention to Shortcut Learning in Text Recognizers
furkanbiten/stvqa_amazon_ocr
STVQA and TextVQA OCR results from Amazon Text in Image pipeline