TharinduRusira's Stars
ggerganov/llama.cpp
LLM inference in C/C++
meta-llama/llama
Inference code for Llama models
karpathy/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
preservim/nerdtree
A tree explorer plugin for vim.
triton-lang/triton
Development repository for the Triton language and compiler
benfred/py-spy
Sampling profiler for Python programs
apache/tvm
Open deep learning compiler stack for CPUs, GPUs, and specialized accelerators
ggerganov/ggml
Tensor library for machine learning
NVIDIA/TensorRT
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
pytorch/tutorials
PyTorch tutorials.
jlfwong/speedscope
🔬 A fast, interactive web-based viewer for performance profiles.
facebookincubator/AITemplate
AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
pytorch/glow
Compiler for Neural Network hardware accelerators
NVIDIA/nccl
Optimized primitives for collective multi-GPU communication
pytorch/TensorRT
PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
intel/intel-extension-for-pytorch
A Python package that extends official PyTorch to improve performance on Intel platforms
ELS-RD/kernl
Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable.
pytorch/torchdynamo
A Python-level JIT compiler designed to make unmodified PyTorch programs faster.
graykode/gpt-2-Pytorch
Simple Text-Generator with OpenAI gpt-2 Pytorch Implementation
attractivechaos/kann
A lightweight C library for artificial neural networks
google/ml-compiler-opt
Infrastructure for Machine Learning Guided Optimization (MLGO) in LLVM.
CQCL/tket
Source code for the TKET quantum compiler, Python bindings and utilities
spcl/pymlir
Python interface for MLIR - the Multi-Level Intermediate Representation
hadisinaee/avicenna
A minimal academic page for Hugo
NVlabs/condensa
Programmable Neural Network Compression
dmsl/academic-responsive-template
Academic Responsive (AR) Website Template
harvard-acc/DeepRecSys
http://vlsiarch.eecs.harvard.edu/research/recommendation/
lyeoni/gpt-pytorch
PyTorch Implementation of OpenAI GPT
jeffhammond/nwchem-tce-triples-kernels
NWChem TCE CCSD(T) loop-driven kernels for performance optimization experiments
SAITPublic/PimAiCompiler
This repository provides graph-mode execution with GPU+PIM at runtime.