Pinned Repositories
automatic-circuits
block-recurrent-transformer
PyTorch implementation of "Block-Recurrent Transformers" (Hutchins et al., 2022)
dashstander.github.io
A beautiful, simple, clean, and responsive Jekyll theme for academics
LibCCNs
An easy-to-use and efficient library for Covariant Compositional Networks (CCNs), with TensorFlow and PyTorch APIs built on a shared C++ core.
lru-jax
monoid-representations
pushshiftr
An R wrapper for the pushshift.io API
sn-grok
wikidata
Rust library for reading data from Wikidata. Very WIP for now.
gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
dashstander's Repositories
dashstander/sn-grok
dashstander/automatic-circuits
dashstander/monoid-representations
dashstander/automata-rs
dashstander/Bayesian-Flow-Networks
A simple implementation of Bayesian Flow Networks
dashstander/BGHT
BGHT: High-performance static GPU hash tables.
dashstander/causal-conv1d
Causal depthwise conv1d in CUDA, with a PyTorch interface
dashstander/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
dashstander/devinterp
Quantifying degeneracy in toy models
dashstander/devinterp-automata
MATS 5.0 project for developmental interpretability stream
dashstander/extalg-torch
Exterior Algebra Utilities in PyTorch
dashstander/InterchangeInterventions
dashstander/levanter
Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax
dashstander/mamba
dashstander/mamba-minimal
Simple, minimal implementation of the Mamba SSM in one file of PyTorch.
dashstander/mamba_interp
Work in progress on Mamba interpretability
dashstander/manim-permutations
dashstander/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
dashstander/ml-sigma-reparam
dashstander/mlp-maps
dashstander/ngram-markov
dashstander/raspy
An interactive exploration of Transformer programming.
dashstander/relu-hyperplane-arrangement
dashstander/relu_edge_subdivision
Code for the ICML 2023 paper Polyhedral Complex Extraction from ReLU Networks using Edge Subdivision.
dashstander/Score-Entropy-Discrete-Diffusion
Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution (https://arxiv.org/abs/2310.16834)
dashstander/sn-fft
dashstander/superposition
dashstander/torch-queue
dashstander/transformer-arithmetic
dashstander/transformer-maps