hnyls2002
@acm-21, RA @ucbrise, member @lm-sys @sgl-project Talk is cheap, show show way...
SJTU, UC Berkeley
hnyls2002's Stars
xai-org/grok-1
Grok open release
oobabooga/text-generation-webui
A Gradio web UI for Large Language Models with support for multiple inference backends.
astral-sh/ruff
An extremely fast Python linter and code formatter, written in Rust.
MonitorControl/MonitorControl
🖥 Control your display's brightness & volume on your Mac as if it was a native Apple Display. Use Apple Keyboard keys or custom shortcuts. Shows the native macOS OSDs.
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
microsoft/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
guidance-ai/guidance
A guidance language for controlling large language models.
dottxt-ai/outlines
Structured Text Generation
abetlen/llama-cpp-python
Python bindings for llama.cpp
mamba-org/mamba
The Fast Cross-Platform Package Manager
sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
lark-parser/lark
Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.
XuehaiPan/nvitop
An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.
openxla/xla
A machine learning compiler for GPUs, CPUs, and ML accelerators
rustcc/writing-an-os-in-rust
"Writing an OS in Rust" (Chinese translation)
Niek/chatgpt-web
ChatGPT web interface using the OpenAI API
S-LoRA/S-LoRA
S-LoRA: Serving Thousands of Concurrent LoRA Adapters
flexflow/FlexFlow
FlexFlow Serve: Low-Latency, High-Performance LLM Serving
Liu-xiandong/How_to_optimize_in_GPU
A series of GPU optimization topics that covers in detail how to optimize CUDA kernels, including several basic ones: elementwise, reduce, sgemv, sgemm, etc. The performance of these kernels is essentially at or near the theoretical limit.
skyzh/write-you-a-vector-db
A Vector Database Tutorial (over CMU-DB's BusTub system)
skyzh/chicv
A minimal and fully-customizable CV template for Typst.
efeslab/Atom
[MLSys'24] Atom: Low-bit Quantization for Efficient and Accurate LLM Serving
IlyaGrebnov/libsais
libsais is a library for linear-time construction of the suffix array, longest common prefix array, and Burrows-Wheeler transform, based on the induced sorting algorithm.
FasterDecoding/BitDelta
FasterDecoding/REST
REST: Retrieval-Based Speculative Decoding, NAACL 2024
matchy233/typst-chi-cv-template
😍 Rip-off of rip-off of skyzh's CV, using typst
mkuchnik/relm
ReLM is a Regular Expression engine for Language Models
Intsights/PySubstringSearch
Python library for fast substring/pattern search written in C++ leveraging Suffix Array Algorithm
yichuan520030910320/MLsys_reading_list
A reading list on some popular MLSys topics
ModelTC/general-sam-py
Python bindings for general-sam and some utilities