Markdown files serving as my blog.
Table of Contents (dates as yyyy-mm-dd):
- On the Byte Latent Transformer
  - 2024-12-31
- Embeddings are in the middle of the model
  - 2024-12-24
- Merge tokens in autoregressive generation
  - 2024-12-09
- LLM Test Time Compute Scaling is model scaling
  - 2024-12-05
  - updated 2024-12-10
- Tokenization and batch-norm: incorporating global statistics
  - 2024-11-14
- Multi-resolution VLMs for robotics
  - 2024-11-09
- Question: Does PEFT with SVD and full parameter finetuning work?
  - 2024-11-08
- Doing Pre-training Research on Instruction Models
  - 2024-11-07
- Mixture of Tokenizers (proposal)
  - 2024-09-03
  - updated 2024-09-14 (major re-write)
  - updated 2024-09-16 (minor edits)