Pinned Repositories
char-diffusion
Character-level diffusion language model
cs224n
Solutions to CS224n: Natural Language Processing with Deep Learning assignments.
text-sed
Implementation of Self-conditioned Embedding Diffusion for Text Generation
trlx
A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
jon-tow's Repositories
jon-tow/cs224n
Solutions to CS224n: Natural Language Processing with Deep Learning assignments.
jon-tow/trlx
A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
jon-tow/gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
jon-tow/jon-tow.github.io
My personal website
jon-tow/bigcode-evaluation-harness
A framework for the evaluation of autoregressive code generation language models.
jon-tow/cc_net
Tools to download and clean up Common Crawl data
jon-tow/contriever
Contriever: Unsupervised Dense Information Retrieval with Contrastive Learning
jon-tow/CPCargo
A simple package to upload DL checkpoints to remote storage
jon-tow/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
jon-tow/dynamic-sparse-flash-attention
jon-tow/english-wordnet
The Open English WordNet
jon-tow/flash-attention
Fast and memory-efficient exact attention
jon-tow/goodreads
Code samples for the Goodreads datasets
jon-tow/helm
Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110).
jon-tow/hf_transfer
jon-tow/megablocks
jon-tow/Megatron-LLM
Distributed trainer for LLMs
jon-tow/mini-omni
Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output conversational capabilities.
jon-tow/ml-engineering
Machine Learning Engineering Guides and Tools
jon-tow/ok
Codex-based command line assistant
jon-tow/rerope
Rectified Rotary Position Embeddings
jon-tow/ring-flash-attention
Ring attention implementation with flash attention
jon-tow/scaled-rope
jon-tow/scattermoe
Triton-based implementation of Sparse Mixture of Experts.
jon-tow/text-dedup
All-in-one text de-duplication
jon-tow/torchtitan
A native PyTorch Library for large model training
jon-tow/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
jon-tow/triton
Development repository for the Triton language and compiler
jon-tow/WaveCoder
Advancing LLM with Diverse Coding Capabilities
jon-tow/zero-bubble-pipeline-parallelism
Zero Bubble Pipeline Parallelism