EduardoPach's Stars
nvim-lua/kickstart.nvim
A launch point for your personal nvim configuration
huggingface/peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
speechbrain/speechbrain
A PyTorch-based Speech Toolkit
skypilot-org/skypilot
SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
lucidrains/x-transformers
A concise but complete full-attention transformer with a set of promising experimental features from various papers
argmaxinc/WhisperKit
On-device Speech Recognition for Apple Silicon
SylphAI-Inc/AdalFlow
AdalFlow: The library to build & auto-optimize LLM applications.
facebookresearch/schedule_free
Schedule-Free Optimization in PyTorch
om-ai-lab/OmDet
Real-time and accurate open-vocabulary end-to-end object detection
XPixelGroup/HAT
CVPR 2023 - Activating More Pixels in Image Super-Resolution Transformer | arXiv - HAT: Hybrid Attention Transformer for Image Restoration
showlab/Show-o
Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.
FoundationVision/GLEE
[CVPR 2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale
MarkMoHR/Awesome-Referring-Image-Segmentation
📚 A collection of papers about Referring Image Segmentation.
FoundationVision/Groma
[ECCV 2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization
alasdairforsythe/tokenmonster
Ungreedy subword tokenizer and vocabulary trainer for Python, Go & JavaScript
argmaxinc/DiffusionKit
On-device Diffusion Models for Apple Silicon
shenyunhang/APE
[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception
OpenGVLab/all-seeing
[ICLR 2024 & ECCV 2024] The All-Seeing Projects: Towards Panoptic Visual Recognition & Understanding and General Relation Comprehension of the Open World
EurekaLabsAI/tensor
The Tensor (or Array)
haochenheheda/segment-anything-annotator
A Python UI for pixel-level annotation based on labelme and Segment Anything. It supports generating multiple masks with SAM (box/point prompts), efficient polygon modification, and category recording. More features are planned, such as incorporating CLIP-based methods for category proposals and VOS methods for video datasets.
Atten4Vis/LW-DETR
This repository is an official implementation of the paper "LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection".
cartesia-ai/edge
On-device intelligence.
MaverickRen/PixelLM
PixelLM is an effective and efficient LMM for pixel-level reasoning and understanding, accepted at CVPR 2024.
argmaxinc/whisperkittools
Python tools for WhisperKit: Model conversion, optimization and evaluation
Surrey-UP-Lab/RegionSpot
Recognize Any Regions
google-research/semivl
[ECCV'24] Official Implementation of SemiVL: Semi-Supervised Semantic Segmentation with Vision-Language Guidance
EasonXiao-888/UVCOM
[CVPR 2024] Bridging the Gap: A Unified Video Comprehension Framework for Moment Retrieval and Highlight Detection
YBZh/DMN
[CVPR 2024] Dual Memory Networks: A Versatile Adaptation Approach for Vision-Language Models
Sssssuperior/VSCode
Code release for "VSCode: General Visual Salient and Camouflaged Object Detection with 2D Prompt Learning"
jianzongwu/robust-ref-seg
(TIP 2024) Towards Robust Referring Image Segmentation