Implementations of multimodal LLM (MLLM) building blocks

Primary language: Jupyter Notebook

Hands-on-MLLM

Implements the basic components of MLLMs. The current focus is on the Transformer and visual encoders.

TODO

  • Transformer (training task: sequence repeater)
  • BERT
  • GPT
  • ViT
  • BLIP-2 (Q-Former)
  • DiT
  • VQ-VAE
  • ...
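
The README does not define the "sequence repeater" task, so the following is only a minimal sketch under the assumption that it is a copy task: the model receives a random token sequence and must reproduce it. The function name `make_repeat_batch`, the special-token ids, and the padding scheme are all hypothetical and are not taken from the repository.

```python
import random

# Hypothetical special-token ids (assumptions, not from the repository).
PAD, BOS, EOS = 0, 1, 2
FIRST_TOKEN = 3  # ordinary tokens start after the special ids


def make_repeat_batch(batch_size, max_len, vocab_size, seed=None):
    """Build one batch for an assumed copy ("sequence repeater") task.

    Returns (src_batch, tgt_batch) as lists of token-id lists.
    Each source row is a random sequence padded to max_len.
    Each target row is BOS + the same sequence + EOS, padded to max_len + 2.
    """
    rng = random.Random(seed)
    src_batch, tgt_batch = [], []
    for _ in range(batch_size):
        length = rng.randint(1, max_len)  # inclusive on both ends
        tokens = [rng.randrange(FIRST_TOKEN, vocab_size) for _ in range(length)]
        pad = [PAD] * (max_len - length)
        src_batch.append(tokens + pad)
        tgt_batch.append([BOS] + tokens + [EOS] + pad)
    return src_batch, tgt_batch


if __name__ == "__main__":
    src, tgt = make_repeat_batch(batch_size=2, max_len=5, vocab_size=12, seed=0)
    for s, t in zip(src, tgt):
        print("src:", s, "tgt:", t)
```

The nested lists can be wrapped in tensors (for example with `torch.tensor`) before being fed to an encoder-decoder Transformer. A task like this is convenient as a first training target because a correct model should reach near-perfect accuracy quickly, which makes bugs in attention masking or positional encoding easy to spot.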