kabicm
A PhD student in Computer Science at ETH Zürich. Previously a software engineer at the Swiss National Supercomputing Centre. Enthusiastic about databases, cloud computing, and HPC.
ETH Zürich
Pinned Repositories
arbor
The Arbor multi-compartment neural network simulation library.
COSMA
Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm
COSTA
Distributed Communication-Optimal Shuffle and Transpose Algorithm
Tiled-MM
Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs.
cp2k
Quantum chemistry and solid state physics software package
grid2grid
A library transforming between two arbitrary grid-like matrix data layouts over MPI ranks.
lu
LU factorization with ScaLAPACK
kabicm's Repositories
kabicm/alpa
Auto parallelization for large-scale neural networks
kabicm/apex
A PyTorch extension: tools for easy mixed-precision and distributed training in PyTorch
kabicm/attention-is-all-you-need-pytorch
A PyTorch implementation of the Transformer model in "Attention is All You Need".
kabicm/ColossalAI
Colossal-AI: A Unified Deep Learning System for Big Model Era
kabicm/conflux
Distributed Communication-Optimal LU-factorization Algorithm
kabicm/COSTA
Distributed Communication-Optimal Shuffle and Transpose Algorithm
kabicm/cuCollections
kabicm/cudf
cuDF - GPU DataFrame Library
kabicm/DFI-public
kabicm/DT-FM
kabicm/FasterTransformer
Transformer related optimization, including BERT, GPT
kabicm/FBGEMM
FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/
kabicm/flash-attention
Fast and memory-efficient exact attention
kabicm/flax
Flax is a neural network library for JAX that is designed for flexibility.
kabicm/gavel
Code for "Heterogeneity-Aware Cluster Scheduling Policies for Deep Learning Workloads", which appeared at OSDI 2020
kabicm/google-research
Google Research
kabicm/marius
Large-scale embeddings on a single machine.
kabicm/mesh
Mesh TensorFlow: Model Parallelism Made Easier
kabicm/mesh-transformer-jax
Model parallel transformers in JAX and Haiku
kabicm/minGPT
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
kabicm/parallelformers
Parallelformers: An Efficient Model Parallelization Toolkit for Deployment
kabicm/pytorch3d
PyTorch3D is FAIR's library of reusable components for deep learning with 3D data
kabicm/query-engine
LingoDB: A new analytical database system that blurs the lines between databases and compilers.
kabicm/semiprof
Simple thread-safe annotation-based C++ profiler.
kabicm/snn_toolbox
Toolbox for converting analog to spiking neural networks (ANN to SNN), and running them in a spiking neuron simulator.
kabicm/spack
A flexible package manager that supports multiple versions, configurations, platforms, and compilers.
kabicm/sql-parser
SQL Parser for C++. Building C++ object structure from SQL statements.
kabicm/transformer-from-scratch
Well documented, unit tested, type checked and formatted implementation of a vanilla transformer - for educational purposes.
kabicm/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
kabicm/trax
Trax — Deep Learning with Clear Code and Speed