YashasSamaga's Stars
codecrafters-io/build-your-own-x
Master programming by recreating your favorite technologies from scratch.
pbatard/rufus
The Reliable USB Formatting Utility
tinygrad/tinygrad
You like pytorch? You like micrograd? You love tinygrad! ❤️
onnx/onnx
Open standard for machine learning interoperability
tensorflow/tensor2tensor
Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
vdumoulin/conv_arithmetic
A technical report on convolution arithmetic in the context of deep learning
arogozhnikov/einops
Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)
NVIDIA/cutlass
CUDA Templates for Linear Algebra Subroutines
Neargye/magic_enum
Static reflection for enums (to string, from string, iteration) for modern C++; works with any enum type without any macro or boilerplate code
llvm-mirror/llvm
Project moved to: https://github.com/llvm/llvm-project
facebookincubator/AITemplate
AITemplate is a Python framework which renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
oneapi-src/oneDNN
oneAPI Deep Neural Network Library (oneDNN)
clab/dynet
DyNet: The Dynamic Neural Network Toolkit
hanickadot/compile-time-regular-expressions
Compile Time Regular Expression in C++
llvm-mirror/clang
Mirror kept for legacy. Moved to https://github.com/llvm/llvm-project
tomgoldstein/loss-landscape
Code for visualizing the loss landscape of neural nets
dendibakh/perf-ninja
This is an online course where you can learn and master the skill of low-level performance analysis and tuning.
google/XNNPACK
High-efficiency floating-point neural network inference operators for mobile, server, and Web
aantron/better-enums
C++ compile-time enum to string, iteration, in a single header file
bloomberg/bde
Basic Development Environment - a set of foundational C++ libraries used at Bloomberg.
novak-99/MLPP
A library created to revitalize C++ as a machine learning front end. Per aspera ad astra.
NervanaSystems/maxas
Assembler for NVIDIA Maxwell architecture
MingSun-Tse/Efficient-Deep-Learning
Collection of recent methods on (deep) neural network compression and acceleration.
cginternals/cmake-init
Template for reliable, cross-platform C++ project setup using CMake.
ikalnytskyi/termcolor
Termcolor is a header-only C++ library for printing colored messages to the terminal. Written just for fun with the help of the Force.
Machine-Learning-Tokyo/papers-with-annotations
Research papers with annotations, illustrations and explanations
NVIDIA/jitify
A single-header C++ library for simplifying the use of CUDA Runtime Compilation (NVRTC).
NVIDIA/cudnn-frontend
cudnn_frontend provides a C++ wrapper for the cuDNN backend API and samples on how to use it
milakov/int_fastdiv
Fast integer division with divisor not known at compile time. To be used primarily in CUDA kernels.
pdziepak/sopt