roG0d's Stars
kyutai-labs/moshi
Moshi: a speech-text foundation model for real-time, full-duplex spoken dialogue
LMCache/LMCache
A KV-cache layer for LLM serving that reuses and offloads KV caches to speed up inference
zml/zml
High performance AI inference stack. Built for production. @ziglang / @openxla / MLIR / @bazelbuild
microsoft/generative-ai-for-beginners
18 Lessons to Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
langchain-ai/langchain
🦜🔗 Build context-aware reasoning applications
run-llama/llama_index
LlamaIndex is a data framework for your LLM applications
ollama/ollama
Get up and running with Llama 3.1, Mistral, Gemma 2, and other large language models.
ggerganov/llama.cpp
LLM inference in C/C++
pytorch/serve
Serve, optimize and scale PyTorch models in production
NVIDIA/TensorRT-LLM
TensorRT-LLM provides an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines with state-of-the-art optimizations for efficient inference on NVIDIA GPUs, plus components for Python and C++ runtimes that execute those engines.
facebookresearch/segment-anything-2
Code for running inference with the Meta Segment Anything Model 2 (SAM 2), links to download the trained model checkpoints, and example notebooks showing how to use the model.
0xSh4dy/learning_llvm
huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
AutoGPTQ/AutoGPTQ
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
casper-hansen/AutoAWQ
AutoAWQ implements the AWQ algorithm for 4-bit quantization, giving roughly a 2x speedup during inference.
mit-han-lab/llm-awq
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
IST-DASLab/marlin
FP16xINT4 LLM inference kernel that achieves near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.
facebookresearch/xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
dottxt-ai/outlines
Structured Text Generation
jeroenvlek/gpt-from-scratch-rs
Andrej Karpathy's "Let's build GPT: from scratch" video & notebook, implemented in Rust with Candle
apple/corenet
CoreNet: A library for training deep neural networks
ToluClassics/candle-tutorial
Tutorial for Porting PyTorch Transformer Models to Candle (Rust)
Syllo/nvtop
GPU & Accelerator process monitoring for AMD, Apple, Huawei, Intel, NVIDIA and Qualcomm
openai/transformer-debugger
A tool for investigating specific behaviors of small language models, combining automated interpretability techniques with sparse autoencoders
karpathy/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Byron/dua-cli
View disk space usage and delete unwanted data, fast.
sxyazi/yazi
💥 Blazing fast terminal file manager written in Rust, based on async I/O.
atuinsh/atuin
✨ Magical shell history
compiler-explorer/compiler-explorer
Run compilers interactively from your web browser and interact with the assembly