Wwiit's Stars

vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python 31.2k 253 5.4k 4.7k
Stability-AI/generative-models
Generative Models by Stability AI
Language: Python 24.8k 258 3112.7k
Mozilla-Ocho/llamafile
Distribute and run LLMs with a single file.
Language: C++ 20.7k 176 4381.1k
abseil/abseil-cpp
Abseil Common Libraries (C++)
Language: C++ 15.1k 592 8832.6k
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
Language: Python 14.4k 122 1.1k 1.4k
naklecha/llama3-from-scratch
llama3 implementation one matrix multiplication at a time
Language: Jupyter Notebook 13.8k 98 181.1k
intel-analytics/ipex-llm
Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, GraphRAG, DeepSpeed, Axolotl, etc
Language: Python 6.8k 252 2.6k 1.3k
google/gemma.cpp
lightweight, standalone C++ inference engine for Google's Gemma models.
Language: C++ 6k 40 88511
NVIDIA/FasterTransformer
Transformer related optimization, including BERT, GPT
Language: C++ 5.9k 62 625894
bbycroft/llm-viz
3D Visualization of a GPT-style LLM
Language: TypeScript 4.1k 33 14450
luban-agi/Awesome-AIGC-Tutorials
Curated tutorials and resources for Large Language Models, AI Painting, and more.
3.9k 29 2262
eliben/pycparser
Complete C99 parser in pure Python
Language: Python 3.3k 93 368612
RainerKuemmerle/g2o
g2o: A General Framework for Graph Optimization
Language: C++ 3.1k 114 5511.1k
CVCUDA/CV-CUDA
CV-CUDA™ is an open-source, GPU accelerated library for cloud-scale image processing and computer vision.
Language: C++ 2.4k 47 169216
megvii-research/NAFNet
The state-of-the-art image restoration model without nonlinear activation functions.
Language: Python 2.3k 21 146285
pytorch/FBGEMM
FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/
Language: C++ 1.2k 66 171508
VirtualGL/virtualgl
Main VirtualGL repository
Language: C++ 701 51 245106
PacktPublishing/Hands-On-GPU-Accelerated-Computer-Vision-with-OpenCV-and-CUDA
Hands-On GPU Accelerated Computer Vision with OpenCV and CUDA, published by Packt
Language: C++ 632 22 6227
KnowingNothing/compiler-and-arch
A list of tutorials, papers, talks, and open-source projects for emerging compilers and architectures
397 23 035
openai/openai-gemm
Open single and half precision gemm implementations
Language: C 374 194 985
jeffhammond/STREAM
STREAM benchmark
Language: C 350 16 5137
mlcommons/algorithmic-efficiency
MLCommons Algorithmic Efficiency is a benchmark and competition measuring neural network training speedups due to algorithmic improvements in both training algorithms and models.
Language: Python 335 24 22571
OpenImageDebugger/OpenImageDebugger
An advanced in-memory image visualization plugin for GDB and LLDB on Linux, with experimental support for MacOS and Windows. Previously known as gdb-imagewatch.
Language: C++ 220 8 5542
BBuf/how-to-optimize-gemm
Language: C 93 3 621
ROCm/hipBLASLt
hipBLASLt is a library that provides general matrix-matrix operations with a flexible API and extends functionalities beyond a traditional BLAS library
Language: Assembly 65 17 3889
Jokeren/GPA
GPU Performance Advisor
Language: Python 63 5 48
guanrenyang/Programming-Massively-Parallel-Processors
Solutions to Programming Massively Parallel Processors
Language: C++ 31 1 15
carlushuang/gcnasm
amdgpu example code in hip/asm
Language: Assembly 21 3 015
berenger-eu/farm-sve
The Farm-SVE package provides a header that implements the ARM C language extensions (ACLE) for the ARM Scalable Vector Extension (SVE) in standard C++.
Language: C++ 13 2 32
jaredhoberock/shmalloc
Dynamic __shared__ memory allocation for CUDA
Language: C++ 6 3 0