yyyouy's Stars
karpathy/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
tatsu-lab/stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
datawhalechina/self-llm
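"A Guide to Using Open-Source Large Models": a tutorial tailored for ** beginners on quickly fine-tuning (full-parameter/LoRA) and deploying open-source large language models (LLMs) and multimodal large language models (MLLMs) from China and abroad in a Linux environment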
NielsRogge/Transformers-Tutorials
This repository contains demos I made with the Transformers library by HuggingFace.
yangjianxin1/Firefly
Firefly: a large-model training tool that supports training Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models
lllyasviel/Paints-UNDO
Understand Human Behavior to Align True Needs
LLaVA-VL/LLaVA-NeXT
yuanzhoulvpi2017/zero_nlp
Chinese NLP solutions (large models, data, models, training, inference)
google-research/big_vision
Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
nicepkg/aide
Conquer Any Code in VSCode: One-Click Comments, Conversions, UI-to-Code, and AI Batch Processing of Files! 💪
allenai/open-instruct
cambrian-mllm/cambrian
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
LTH14/mar
PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838
NVlabs/MambaVision
Official PyTorch Implementation of MambaVision: A Hybrid Mamba-Transformer Vision Backbone
TencentARC/Open-MAGVIT2
Open-MAGVIT2: Democratizing Autoregressive Visual Generation
TinyLLaVA/TinyLLaVA_Factory
A Framework of Small-scale Large Multimodal Models
bytedance/1d-tokenizer
This repo contains the code for 1D tokenizer and generator
thu-ml/CRM
[ECCV 2024] Single Image to 3D Textured Mesh in 10 seconds with Convolutional Reconstruction Model.
AILab-CVC/SEED
Official implementation of SEED-LLaMA (ICLR 2024).
EugenHotaj/pytorch-generative
Easy generative modeling in PyTorch
lucidrains/autoregressive-diffusion-pytorch
Implementation of Autoregressive Diffusion in PyTorch
thu-ml/cond-image-leakage
Official implementation for "Identifying and Solving Conditional Image Leakage in Image-to-Video Diffusion Model" (NeurIPS 2024)
ShareGPT4Omni/ShareGPT4V
[ECCV 2024] ShareGPT4V: Improving Large Multi-modal Models with Better Captions
valeoai/Maskgit-pytorch
Unofficial MaskGIT reproduction in PyTorch
bytedance/tarsier
Tarsier: a family of large-scale video-language models designed to generate high-quality video descriptions, with strong general video understanding capability.
chairc/Integrated-Design-Diffusion-Model
IDDM (Industrial, landscape, animate, spectrogram...), supports DDPM, DDIM, PLMS, webui and distributed training. A PyTorch implementation of diffusion models, generative models, and distributed training
lucidrains/titok-pytorch
Implementation of TiTok, proposed by Bytedance in "An Image is Worth 32 Tokens for Reconstruction and Generation"
lucasjinreal/LLaVA-Magvit2
lucasjinreal/ImageTokenizer
imagetokenizer is a Python package that helps you encode visuals and generate visual token IDs from a codebook; it supports both image and video.
SmartFlowAI/LLM-Tutorial