Pinned Repositories
CME213-LectureExamples
Short programs created in class and other examples
cub
CUB is a flexible library of cooperative threadblock primitives and other utilities for CUDA kernel programming.
docker.openmpi
A scalable OpenMPI runtime container for Docker
gemmlowp
Low-precision matrix multiplication
jax
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
Megatron-LM
Ongoing research training transformer models at scale
nervanagpu
Nervana GPU library
openai-gemm
Open single- and half-precision GEMM implementations
slack-anonymous
Express your personal thoughts and desires in a depersonalized way, concealing your true identity from the rest of your Slack team.
warp-ctc
Fast parallel CTC.
ekelsen's Repositories
ekelsen/warp-ctc
Fast parallel CTC.
ekelsen/CME213-LectureExamples
Short programs created in class and other examples
ekelsen/nervanagpu
Nervana GPU library
ekelsen/cub
CUB is a flexible library of cooperative threadblock primitives and other utilities for CUDA kernel programming.
ekelsen/docker.openmpi
A scalable OpenMPI runtime container for Docker
ekelsen/gemmlowp
Low-precision matrix multiplication
ekelsen/jax
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
ekelsen/Megatron-LM
Ongoing research training transformer models at scale
ekelsen/openai-gemm
Open single- and half-precision GEMM implementations
ekelsen/slack-anonymous
Express your personal thoughts and desires in a depersonalized way, concealing your true identity from the rest of your Slack team.
ekelsen/spf13-vim
The ultimate vim distribution