Darth-Kronos's Stars
lyuwenyu/RT-DETR
[CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥 🔥 🔥
microsoft/onnxruntime-genai
Generative AI extensions for onnxruntime
josephmisiti/awesome-machine-learning
A curated list of awesome Machine Learning frameworks, libraries and software.
quic/aimet-model-zoo
A collection of popular neural network models quantized with AIMET, with accuracy benchmarks.
pytorch/TensorRT
PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
microsoft/onnxconverter-common
Common utilities for ONNX converters
openvinotoolkit/openvino
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
dmlc/dlpack
Common in-memory tensor structure
roboflow/inference
A fast, easy-to-use, production-ready inference server for computer vision supporting deployment of many popular model architectures and fine-tuned models.
NX-AI/vision-lstm
xLSTM as Generic Vision Backbone
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
NVIDIA/FasterTransformer
Transformer related optimization, including BERT, GPT
ModelTC/lightllm
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
microsoft/Olive
Olive: Simplify ML Model Finetuning, Conversion, Quantization, and Optimization for CPUs, GPUs and NPUs.
gpu-mode/lectures
Material for gpu-mode lectures
quic/aimet
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
quic/ai-hub-models
The Qualcomm® AI Hub Models are a collection of state-of-the-art machine learning models optimized for performance (latency, memory etc.) and ready to deploy on Qualcomm® devices.
merrymercy/awesome-tensor-compilers
A list of awesome compiler projects and papers for tensor computation and deep learning.
NVIDIA/TensorRT-Model-Optimizer
TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillation, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed on NVIDIA GPUs.
openxla/xla
A machine learning compiler for GPUs, CPUs, and ML accelerators
unslothai/unsloth
Finetune Llama 3.2, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
meta-llama/llama-recipes
Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods, covering single- and multi-node GPU setups. Supports default & custom datasets for applications such as summarization and Q&A, and a number of candidate inference solutions such as HF TGI and vLLM for local or cloud deployment. Includes demo apps showcasing Meta Llama for WhatsApp & Messenger.
meta-llama/llama3
The official Meta Llama 3 GitHub site
run-llama/llama_index
LlamaIndex is a data framework for your LLM applications
karpathy/llm.c
LLM training in simple, raw C/CUDA
pytorch/torchdynamo
A Python-level JIT compiler designed to make unmodified PyTorch programs faster.
aws-neuron/aws-neuron-sdk
Powering AWS purpose-built machine learning chips. Blazing fast and cost effective, natively integrated into PyTorch and TensorFlow and integrated with your favorite AWS services
zwang4/awesome-machine-learning-in-compilers
Must read research papers and links to tools and datasets that are related to using machine learning for compilers and systems optimisation
federico-busato/Modern-CPP-Programming
Modern C++ Programming Course (C++03/11/14/17/20/23/26)
hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All