dali92002's Stars
aiintelligentsystems/next-level-bert
emanuelevivoli/awesome-comics-understanding
The official repo of the Comics Survey: "A missing piece in Vision and Language: A Survey on Comics Understanding"
wangkai930418/awesome-diffusion-categorized
Collection of diffusion model papers categorized by their subareas
arogozhnikov/einops
Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)
FoundationVision/VAR
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
andreybarsky/annotation
Annotation system for labelling bounding boxes using OpenCV
ayanban011/GraphKD
[ICDAR 2024] (Best Student Paper🏆) Exploring Knowledge Distillation Towards Document Object Detection with Structured Graph Creation
hecoding/Hyper-Modulation
Official Implementation for "Transferring Unconditional to Conditional GANs with Hyper-Modulation" CVPRW 22 https://arxiv.org/abs/2112.02219
microsoft/table-transformer
Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS evaluation metric.
voxel51/fiftyone
Refine high-quality datasets and visual AI models
jyf588/transformer-inertial-poser
Python implementation accompanying the Transformer Inertial Poser paper at SIGGRAPH Asia 2022
Xinyu-Yi/EgoLocate
A real-time system that simultaneously captures human pose, reconstructs the scene in sparse 3D points, and localizes the human in the scene with 6 IMUs and a body-worn phone camera
leitro/LabelAdaptiveMixup-SER
rubenpt91/PFL-DocVQA-Competition
eth-siplab/AvatarPoser
Official Code for ECCV 2022 paper "AvatarPoser: Articulated Full-Body Pose Tracking from Sparse Motion Sensing"
lllyasviel/ControlNet
Let us control diffusion models!
rossumai/docile
DocILE: Document Information Localization and Extraction Benchmark
lucidrains/denoising-diffusion-pytorch
Implementation of Denoising Diffusion Probabilistic Model in PyTorch
CompVis/stable-diffusion
A latent text-to-image diffusion model
weixi-feng/Structured-Diffusion-Guidance
Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis
CompVis/latent-diffusion
High-Resolution Image Synthesis with Latent Diffusion Models
sjvasquez/handwriting-synthesis
Handwriting Synthesis with RNNs ✏️
andreagemelli/doc2graph
Doc2Graph transforms documents into graphs and exploits a GNN to solve several tasks.
furkanbiten/idl_data
OCR Annotations from Amazon Textract for Industry Documents Library
ivy-llc/ivy
Convert Machine Learning Code Between Frameworks
ayanban011/GACNN
Generative Adversarial Convolutional Neural Network
ayanban011/jNMF
Discovering De-similarities of Modular Structure Between Tumor Cells and Normal Cells by Integrating Multiple Data Sources Through Joint Non-Negative Matrix Factorization
ayanban011/dct-dft-fft-craft
DCT-DFT-FFT Based Method for Text Detection in Underwater Images
ayanban011/HAGNN
Gene Selection of Microarray Data using Heatmap Analysis and Graph Neural Network
ayanban011/Machine-Learning
In summer 2020, I had the chance to learn machine learning from Andrew Ng through coursework organised by Stanford University. Here, I will upload all the assignments I completed during the coursework.