AndrewValukhov's Stars
meta-llama/llama-stack
Composable building blocks to build Llama Apps
apple/ml-cross-entropy
deepseek-ai/DeepSeek-V3
ML-SystemDesign/MLSystemDesign
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
averkij/a-studio
Lingtrain Alignment Studio is an ML-based app for aligning texts in different languages. It can produce parallel corpora and parallel books.
vllm-project/llm-compressor
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
mrdbourke/simple-local-rag
Build a RAG (Retrieval Augmented Generation) pipeline from scratch and have it all run locally.
stas00/ml-engineering
Machine Learning Engineering Open Book
chameleon-lizard/SkoltechChatBot
Agentic RAG implementation via a Telegram bot.
turboderp/exllama
A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
NirDiamant/GenAI_Agents
This repository provides tutorials and implementations for various Generative AI Agent techniques, from basic to advanced. It serves as a comprehensive guide for building intelligent, interactive AI systems.
oobabooga/text-generation-webui
A Gradio web UI for Large Language Models with support for multiple inference backends.
huggingface/evaluation-guidebook
Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard and designing lighteval!
QiuYannnn/Local-File-Organizer
An AI-powered file management tool that ensures privacy by organizing local texts and images. Using Llama3.2 3B and Llava v1.6 models with the Nexa SDK, it intuitively scans, restructures, and organizes files for quick, seamless access and easy retrieval.
Infatoshi/cuda-course
turbo-llm/turbo-alignment
Library for industrial alignment.
alirezadir/Machine-Learning-Interviews
This repo is meant to serve as a guide for Machine Learning/AI technical interviews.
esokolov/ml-course-hse
Machine Learning at the HSE Faculty of Computer Science
ollama/ollama
Get up and running with Llama 3.3, Mistral, Gemma 2, and other large language models.
XiongjieDai/GPU-Benchmarks-on-LLM-Inference
Multiple NVIDIA GPUs or Apple Silicon for Large Language Model Inference?
anthropics/courses
Anthropic's educational courses
udlbook/udlbook
Understanding Deep Learning - Simon J.D. Prince
arj7192/MasteringPyTorchV2
jtsang4/claude-to-chatgpt
This project converts the API of Anthropic's Claude model to the OpenAI Chat API format.
EbookFoundation/free-programming-books
Freely available programming books
HigherOrderCO/Bend
A massively parallel, high-level programming language
siyan-zhao/prepacking
The source code of our work "Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models"
hyintell/RetrievalQA
Source code of the paper: RetrievalQA: Assessing Adaptive Retrieval-Augmented Generation for Short-form Open-Domain Question Answering [Findings of ACL 2024]
schwartz-lab-NLP/TOVA
Token Omission Via Attention