vistapath-dan's Stars
computationalpathologygroup/ASAP
Program for the analysis and visualization of whole-slide images in digital pathology
facebookresearch/sam2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
OptimalScale/LMFlow
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
ChongQingNoSubway/DGR-MIL
Code for paper: DGR-MIL: Exploring Diverse Global Representation in Multiple Instance Learning for Whole Slide Image Classification [ECCV 2024]
facebookresearch/sapiens
High-resolution models for human tasks.
google-research/maxim
[CVPR 2022 Oral] Official repository for "MAXIM: Multi-Axis MLP for Image Processing". SOTA for denoising, deblurring, deraining, dehazing, and enhancement.
HVision-NKU/StoryDiffusion
Accepted as [NeurIPS 2024] Spotlight Presentation Paper
bytedance/MoMA
MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation
TempleX98/MoVA
[NeurIPS 2024] MoVA: Adapting Mixture of Vision Experts to Multimodal Context
FoundationVision/Groma
[ECCV 2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization
FoundationVision/VAR
[NeurIPS 2024 Oral][GPT beats diffusion] [scaling laws in visual generation] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
thunlp/LLaVA-UHD
LLaVA-UHD v2: an MLLM Integrating High-Resolution Feature Pyramid via Hierarchical Window Transformer
GraphPKU/PiSSA
PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models (NeurIPS 2024 Spotlight)
instantX-research/InstantStyle
InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation
Czm369/MixPL
Mixed Pseudo Labels for Semi-Supervised Object Detection
mhamilton723/FeatUp
Official code for "FeatUp: A Model-Agnostic Framework for Features at Any Resolution" ICLR 2024
microsoft/Cream
This is a collection of our NAS and Vision Transformer work.
Coloquinte/torchSR
Super Resolution datasets and models in Pytorch
apple/ml-flair
A large labelled image dataset for benchmarking in federated learning
apple/ml-mobileclip
This repository contains the official implementation of the research paper, "MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training" CVPR 2024
Vchitect/Latte
Latte: Latent Diffusion Transformer for Video Generation.
lmmlzn/Awesome-LLMs-Datasets
Summarize existing representative LLMs text datasets.
PrefectHQ/prefect
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
ollama/ollama
Get up and running with Llama 3.3, Mistral, Gemma 2, and other large language models.
huggingface/diffusers
Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
microsoft/TransformerCompression
For releasing code related to compression methods for transformers, accompanying our publications
khanrc/honeybee
Official implementation of project Honeybee (CVPR 2024)
lxtGH/OMG-Seg
OMG-LLaVA and OMG-Seg codebase [CVPR-24 and NeurIPS-24]
TencentARC/PhotoMaker
PhotoMaker [CVPR 2024]
zhuyiche/llava-phi