triton

There are 137 repositories under the triton topic.

linkedin/Liger-Kernel
Efficient Triton Kernels for LLM Training
Language: Python | 3.5k 40 117207
ELS-RD/kernl
Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable.
Language: Jupyter Notebook | 1.5k 29 17494
DefTruth/CUDA-Learn-Notes
📚Modern CUDA Learn Notes with PyTorch: Tensor/CUDA Cores, 📖150+ CUDA Kernels with PyTorch bindings, 📖HGEMM/SGEMM (95%~99% cuBLAS performance), 📖100+ LLM/CUDA Blogs.
Language: Cuda | 1.5k 13 6162
TritonDataCenter/containerpilot
A service for autodiscovery and configuration of applications running in containers
Language: Go | 1.1k 83 313134
JonathanSalwan/Tigress_protection
Playing with the Tigress software protection. Break some of its protections and solve their reverse engineering challenges. Automatic deobfuscation using symbolic execution, taint analysis and LLVM.
Language: LLVM | 812 37 0144
BobMcDear/attorch
A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
Language: Python | 483 10 522
JafarAkhondali/acer-predator-turbo-and-rgb-keyboard-linux-module
Linux kernel module to support Turbo mode and RGB Keyboard for Acer Predator notebook series
Language: C | 382 20 14371
FlagOpen/FlagGems
FlagGems is an operator library for large language models implemented in Triton Language.
Language: Python | 343 18 2746
d4em0n/exrop
Automatic ROPChain Generation
Language: Python | 280 7 823
opendilab/DI-hpc
OpenDILab RL HPC OP Lib, including CUDA and Triton kernel
Language: Python | 224 3 07
SQLab/symgdb
SymGDB - symbolic execution plugin for gdb
Language: Python | 215 14 126
Colton1skees/Dna
LLVM based static binary analysis framework
Language: C++ | 191 5 318
kakaobrain/trident
A performance library for machine learning applications.
Language: Python | 180 4 811
mmsaeed509/bspwm-dots
Ozoz dotfiles for bspwm, i3WM
Language: Shell | 157 5 310
allegroai/clearml-serving
ClearML - Model-Serving Orchestration and Repository Solution
Language: Python | 137 11 5740
novioleo/Savior
(WIP) A simple, lightweight, fast-to-integrate, pipelined deployment framework for algorithm services that ensures reliability, high concurrency, and scalability.
Language: Python | 137 8 128
alphaSeclab/DBI-Stuff
Resources About Dynamic Binary Instrumentation and Dynamic Binary Analysis
130 8 028
NVIDIA-ISAAC-ROS/isaac_ros_object_detection
NVIDIA-accelerated, deep learned model support for image space object detection
Language: C++ | 121 2 2627
NVIDIA-ISAAC-ROS/isaac_ros_dnn_inference
NVIDIA-accelerated DNN model inference ROS 2 packages using NVIDIA Triton/TensorRT for both Jetson and x86_64 with CUDA-capable GPU
Language: C++ | 104 4 3117
notAI-tech/fastDeploy
Deploy DL/ML inference pipelines with minimal extra code.
Language: Python | 97 8 617
alexzhang13/flashattention2-custom-mask
Triton implementation of FlashAttention2 that adds Custom Masks.
Language: Python | 76 5 126
triton/triton
Triton Operating System
Language: Nix | 63 12 969
ergrelet/triton-bn
Binary Ninja plugin that can be used to apply Triton's dead store elimination pass on basic blocks or functions.
Language: C++ | 58 7 34
redis-developer/redis-nvidia-recsys
Three examples of recommendation system pipelines with NVIDIA Merlin and Redis
Language: PureBasic | 56 6 35
kyegomez/EXA-1
An EXA-Scale repository of Multi-Modality AI resources from papers and models, to foundational libraries!
Language: Jupyter Notebook | 41 3 01
MarineBioAcousticsRC/Triton
🐳 Scripps Whale Acoustics Lab 🌎 Scripps Acoustic Ecology Lab - Triton with remoras in development
Language: MATLAB | 37 28 2627
suvash/nixos-nvidia-cuda-python-docker-compose
A step-by-step guide to setting up Nvidia GPUs with CUDA support running on Docker (and Compose) containers on NixOS host
Language: Dockerfile | 37 3 24
Lallapallooza/fast-audiomentations
⚡ Blazing fast audio augmentation in Python, powered by GPU for high-efficiency processing in machine learning and audio analysis tasks.
Language: Python | 32 3 01
dame-cell/Triformer
Transformers components but in Triton
Language: Python | 270
mustakimur/COIN-Attacks
COIN Attacks: on Insecurity of Enclave Untrusted Interfaces in SGX - ASPLOS 2020
Language: C++ | 26 3 312
cosine0/amphitrite
Symbolic debugging tool using JonathanSalwan/Triton
Language: Python | 25 5 07
mustakimur/CFI-LB
Adaptive Callsite-sensitive Control Flow Integrity - EuroS&P'19
Language: C++ | 21 2 67
thanhlnbka/yolov7-triton-deepstream
Language: C++ | 21 1 13
Colton1skees/TritonTranslator
Standalone static version of Triton's x86/x64 translator
Language: C++ | 19 2 06
DeepAuto-AI/hip-attention
Training-free Post-training Efficient Sub-quadratic Complexity Attention. Implemented with OpenAI Triton.
Language: Python | 19 5 03
ndtands/Speed_up_Model
Increase the inference speed of the model
Language: Python | 19 1 00
