Pinned Repositories
flash-attention
Fast and memory-efficient exact attention
llama.cpp
LLM inference in C/C++
agenta
The all-in-one LLMOps platform: prompt management, evaluation, human feedback, and deployment.
axolotl
Go ahead and axolotl questions
golib
Open-source versions of common Go libraries useful to many projects.
llama.cpp
Port of Facebook's LLaMA model in C/C++
lm-evaluation-harness
A framework for few-shot evaluation of autoregressive language models.
sweep
Sweep: AI-powered Junior Developer for small features and bug fixes.
vllm-deepseek
A high-throughput and memory-efficient inference and serving engine for LLMs
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
seungduk-yanolja's Repositories
seungduk-yanolja/vllm-deepseek
A high-throughput and memory-efficient inference and serving engine for LLMs
seungduk-yanolja/agenta
The all-in-one LLMOps platform: prompt management, evaluation, human feedback, and deployment.
seungduk-yanolja/axolotl
Go ahead and axolotl questions
seungduk-yanolja/golib
Open-source versions of common Go libraries useful to many projects.
seungduk-yanolja/llama.cpp
Port of Facebook's LLaMA model in C/C++
seungduk-yanolja/lm-evaluation-harness
A framework for few-shot evaluation of autoregressive language models.
seungduk-yanolja/sweep
Sweep: AI-powered Junior Developer for small features and bug fixes.