NielsRogge
ML @HuggingFace. Interested in deep learning, NLP. Contributed 40+ models to HuggingFace Transformers
HuggingFaceBelgium
Pinned Repositories
awesome-huggingface
Repository containing awesome resources regarding Hugging Face tooling.
coco-eval
A tiny package supporting distributed computation of COCO metrics for PyTorch models.
CogVLM
A state-of-the-art open visual language model
Description2Process
Transforming textual descriptions into process models using deep learning
NielsRogge
Short README about myself.
tapas_utils
A package containing utils for the PyTorch version of the Tapas algorithm.
transformers
🤗 Transformers: State-of-the-art Natural Language Processing for PyTorch and TensorFlow 2.0.
Transformers-Tutorials
This repository contains demos I made with the Transformers library by HuggingFace.
unilm
UniLM - Unified Language Model Pre-training / Pre-training for NLP and Beyond
Vision-Transformer-papers
This repository contains an overview of important follow-up works based on the original Vision Transformer (ViT) by Google.
NielsRogge's Repositories
NielsRogge/Transformers-Tutorials
This repository contains demos I made with the Transformers library by HuggingFace.
NielsRogge/transformers
🤗 Transformers: State-of-the-art Natural Language Processing for PyTorch and TensorFlow 2.0.
NielsRogge/DocLayout-YOLO
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception
NielsRogge/huggingface.js
Utilities to use the Hugging Face Hub API
NielsRogge/segment-anything-2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
NielsRogge/Show-o
Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.
NielsRogge/clip_dinoiser
Official implementation of 'CLIP-DINOiser: Teaching CLIP a few DINO tricks' paper.
NielsRogge/GST
Official implementation of "GST: Precise 3D Human Body from a Single Image with Gaussian Splatting Transformers"
NielsRogge/Long-CLIP
[ECCV 2024] Official code for "Long-CLIP: Unlocking the Long-Text Capability of CLIP"
NielsRogge/ml-veclip
The official repo for the paper "VeCLIP: Improving CLIP Training via Visual-enriched Captions"
NielsRogge/ultralytics
NEW - YOLOv8 🚀 in PyTorch > ONNX > OpenVINO > CoreML > TFLite
NielsRogge/AiM
Official PyTorch Implementation of "Scalable Autoregressive Image Generation with Mamba"
NielsRogge/Apollo
Music repair method to convert lossy MP3 compressed music to lossless music.
NielsRogge/chat-ui
Open source codebase powering the HuggingChat app
NielsRogge/CoMAE
[AAAI 2023 Oral] CoMAE: Single Model Hybrid Pre-training on Small-Scale RGB-D Datasets
NielsRogge/count_token_optimization
NielsRogge/CSD
NielsRogge/doubletake
[ECCV 2024] DoubleTake: Geometry Guided Depth Estimation
NielsRogge/EMA-VFI
[CVPR 2023] Extracting Motion and Appearance via Inter-Frame Attention for Efficient Video Frame Interpolation
NielsRogge/FluxMusic
Text-to-Music Generation with Rectified Flow Transformers
NielsRogge/GenerateCT
[ECCV 2024] GenerateCT: Text-Conditional Generation of 3D Chest CT Volumes
NielsRogge/LightenDiffusion
Official PyTorch implementation for "LightenDiffusion: Unsupervised Low-Light Image Enhancement with Latent-Retinex Diffusion Models"
NielsRogge/Lotus
Official Implementation of Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction
NielsRogge/mini-omni
Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.
NielsRogge/PGTFormer
[IJCAI'24] Beyond Alignment: Blind Video Face Restoration via Parsing-Guided Temporal-Coherent Transformer
NielsRogge/shic
Official implementation of the 2024 ECCV paper SHIC: Shape-Image Correspondences with no Keypoint Annotation
NielsRogge/sos-bench
This codebase stores the complete artifacts and describes how to reproduce or extend the results from the paper "Style over Substance: Failure modes of LLM judges in alignment benchmarking", including the MisMo-Bench meta-benchmark.
NielsRogge/StreamingT2V
StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
NielsRogge/unimatch
[TPAMI'23] Unifying Flow, Stereo and Depth Estimation
NielsRogge/VFIMamba
VFIMamba: Video Frame Interpolation with State Space Models