i404788/DPFP-pytorch
Implementation of Deterministic Parameter-Free Projection (DPFP) from the paper "Linear Transformers Are Secretly Fast Weight Memory Systems"
Cuda, BSD-2-Clause