triton
There are 137 repositories under the triton topic.
linkedin/Liger-Kernel
Efficient Triton Kernels for LLM Training
ELS-RD/kernl
Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable.
DefTruth/CUDA-Learn-Notes
📚Modern CUDA Learn Notes with PyTorch: Tensor/CUDA Cores, 📖150+ CUDA Kernels with PyTorch bindings, 📖HGEMM/SGEMM (95%~99% cuBLAS performance), 📖100+ LLM/CUDA Blogs.
TritonDataCenter/containerpilot
A service for autodiscovery and configuration of applications running in containers
JonathanSalwan/Tigress_protection
Playing with the Tigress software protection. Break some of its protections and solve their reverse engineering challenges. Automatic deobfuscation using symbolic execution, taint analysis and LLVM.
BobMcDear/attorch
A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
JafarAkhondali/acer-predator-turbo-and-rgb-keyboard-linux-module
Linux kernel module to support Turbo mode and RGB Keyboard for Acer Predator notebook series
FlagOpen/FlagGems
FlagGems is an operator library for large language models implemented in Triton Language.
d4em0n/exrop
Automatic ROPChain Generation
opendilab/DI-hpc
OpenDILab RL HPC OP Lib, including CUDA and Triton kernels
SQLab/symgdb
SymGDB - symbolic execution plugin for gdb
Colton1skees/Dna
LLVM based static binary analysis framework
kakaobrain/trident
A performance library for machine learning applications.
mmsaeed509/bspwm-dots
Ozoz dotfiles for bspwm, i3WM
allegroai/clearml-serving
ClearML - Model-Serving Orchestration and Repository Solution
novioleo/Savior
(WIP) A simple, lightweight, pipelined deployment framework for algorithm services that integrates quickly and aims to ensure reliability, high concurrency, and scalability.
alphaSeclab/DBI-Stuff
Resources About Dynamic Binary Instrumentation and Dynamic Binary Analysis
NVIDIA-ISAAC-ROS/isaac_ros_object_detection
NVIDIA-accelerated, deep learned model support for image space object detection
NVIDIA-ISAAC-ROS/isaac_ros_dnn_inference
NVIDIA-accelerated DNN model inference ROS 2 packages using NVIDIA Triton/TensorRT for both Jetson and x86_64 with CUDA-capable GPU
notAI-tech/fastDeploy
Deploy DL/ML inference pipelines with minimal extra code.
alexzhang13/flashattention2-custom-mask
Triton implementation of FlashAttention2 that adds Custom Masks.
triton/triton
Triton Operating System
ergrelet/triton-bn
Binary Ninja plugin that can be used to apply Triton's dead store elimination pass on basic blocks or functions.
redis-developer/redis-nvidia-recsys
Three examples of recommendation system pipelines with NVIDIA Merlin and Redis
kyegomez/EXA-1
An EXA-Scale repository of Multi-Modality AI resources from papers and models, to foundational libraries!
MarineBioAcousticsRC/Triton
🐋 Scripps Whale Acoustics Lab 🌎 Scripps Acoustic Ecology Lab - Triton with remoras in development
suvash/nixos-nvidia-cuda-python-docker-compose
A step-by-step guide to setting up Nvidia GPUs with CUDA support running on Docker (and Compose) containers on NixOS host
Lallapallooza/fast-audiomentations
⚡ Blazing fast audio augmentation in Python, powered by GPU for high-efficiency processing in machine learning and audio analysis tasks.
dame-cell/Triformer
Transformer components, implemented in Triton
mustakimur/COIN-Attacks
COIN Attacks: on Insecurity of Enclave Untrusted Interfaces in SGX - ASPLOS 2020
cosine0/amphitrite
Symbolic debugging tool using JonathanSalwan/Triton
mustakimur/CFI-LB
Adaptive Callsite-sensitive Control Flow Integrity - EuroS&P'19
Colton1skees/TritonTranslator
Standalone static version of Triton's x86/x64 translator
DeepAuto-AI/hip-attention
Training-free Post-training Efficient Sub-quadratic Complexity Attention. Implemented with OpenAI Triton.
ndtands/Speed_up_Model
Increase the inference speed of the model