dmizr's Stars
meta-llama/llama
Inference code for Llama models
karpathy/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
karpathy/LLM101n
LLM101n: Let's build a Storyteller
google-research/tuning_playbook
A playbook for systematically maximizing the performance of deep learning models.
ml-explore/mlx
MLX: An array framework for Apple silicon
salesforce/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
facebookresearch/xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
LargeWorldModel/LWM
Large World Model -- Modeling Text and Video with Millions Context
google-research/arxiv-latex-cleaner
arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv
FoundationVision/VAR
[NeurIPS 2024 Oral][GPT beats diffusion] [scaling laws in visual generation] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
rom1504/img2dataset
Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.
google/prompt-to-prompt
google-research/t5x
OFA-Sys/OFA
Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
apple/axlearn
An Extensible Deep Learning Library
kakaobrain/coyo-dataset
COYO-700M: Large-scale Image-Text Pair Dataset
mlfoundations/dclm
DataComp for Language Models
xl0/lovely-tensors
Tensors, for human consumption
apple/ml-aim
This repository provides the code and model checkpoints for AIMv1 and AIMv2 research projects.
LTH14/rcg
PyTorch implementation of RCG https://arxiv.org/abs/2312.03701
google/seqio
Task-based datasets, preprocessing, and evaluation for sequence models.
epfLLM/Megatron-LLM
Distributed trainer for LLMs
penghao-wu/vstar
PyTorch Implementation of "V* : Guided Visual Search as a Core Mechanism in Multimodal LLMs"
google-deepmind/nanodo
MatX-inc/seqax
seqax = sequence modeling + JAX
graphcore-research/unit-scaling
A library for unit scaling in PyTorch
cloneofsimo/scaling-guide
WIP
cloneofsimo/min-fsdp
cloneofsimo/ezmup
Simple implementation of muP, based on Spectral Condition for Feature Learning. The implementation is SGD only; don't use it for Adam.
liuxingbin/dbot
[ICLR2024] Exploring Target Representations for Masked Autoencoders