zhaoshitian's Stars
facebookresearch/segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
PKU-YuanGroup/Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
Stability-AI/StableCascade
Official Code for Stable Cascade
DepthAnything/Depth-Anything-V2
Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
lllyasviel/Paints-UNDO
Understand Human Behavior to Align True Needs
dvlab-research/MGM
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
lm-sys/RouteLLM
A framework for serving and evaluating LLM routers - save LLM costs without compromising quality!
AiuniAI/Unique3D
Official implementation of Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image
lucidrains/vector-quantize-pytorch
Vector (and Scalar) Quantization, in PyTorch
thunlp/UltraChat
Large-scale, Informative, and Diverse Multi-round Chat Data (and Models)
AlibabaResearch/AdvancedLiterateMachinery
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
open-compass/VLMEvalKit
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting ~100 VLMs and 40+ benchmarks
GAIR-NLP/anole
Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation
Alpha-VLLM/Lumina-mGPT
Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining"
catcathh/UltraPixel
Implementation of UltraPixel: Advancing Ultra-High-Resolution Image Synthesis to New Peaks
google-research/maskgit
Official JAX Implementation of MaskGIT
huggingface/cosmopedia
dome272/MaskGIT-pytorch
PyTorch implementation of MaskGIT: Masked Generative Image Transformer (https://arxiv.org/pdf/2202.04200.pdf)
kyegomez/CM3Leon
An open-source implementation of "Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning", an all-new multimodal AI that uses only a decoder to generate both text and images
huggingface/open-muse
Open reproduction of MUSE for fast text2image generation.
frank-xwang/UnSAM
Code release for "Segment Anything without Supervision"
Jyouhou/UnrealText
Synthetic Scene Text from 3D Engines
zhuyr97/WGWS-Net
duchenzhuang/FSQ-pytorch
A PyTorch Implementation of Finite Scalar Quantization
CodeGoat24/DreamText
Official implementation of High Fidelity Scene Text Synthesis.
iiclab/DecompST
99Franklin/DiffText
agneet42/revision
[ECCV 2024] "REVISION: Rendering Tools Enable Spatial Fidelity in Vision-Language Models"
cloneofsimo/compare_aura_sd3
Vibe check Imagegen models (AuraFlow vs Others)