zhjiang22's Stars
OpsPAI/TraceZip
This repository contains the components used to build a prototype system of TraceZip, consisting of four pieces.
openai/openai-agents-python
A lightweight, powerful framework for multi-agent workflows
dyweb/papers-notebook
:page_facing_up: :cn: :page_with_curl: Papers Notebook: paper reading notes on distributed systems, virtualization, and machine learning
ldos-project/TraceLLM
deepseek-ai/open-infra-index
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
snowie2000/mactype
Better font rendering for Windows.
JetBrains-Research/stack-trace-deduplication
Code for the embedding and reranker models, as well as for evaluation, from the paper "Stack Trace Deduplication: Faster, More Accurately, and in More Realistic Scenarios".
robusta-dev/holmesgpt
Your 24/7 On-Call AI Agent - Solve Alerts Faster with Automatic Correlations, Investigations, and More
Faustinaqq/CKAAD
TsinghuaDatabaseGroup/DB-GPT
An LLM Based Diagnosis System (https://arxiv.org/pdf/2312.01454.pdf)
punica-ai/punica
Serving multiple LoRA-finetuned LLMs as one
mit-pdos/sigmaos
dywsjtu/apparate
Artifact for "Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving" [SOSP '24]
muse-research-lab/llm-inference-workload-eval
This repository contains all the code used for the experimental analysis of the paper "The Importance of Workload Choice in Evaluating LLM Inference Systems".
zhengzangw/Sequence-Scheduling
PyTorch implementation of paper "Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline".
microsoft/ParrotServe
[OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable
lunyiliu/LogLM
From Task-based to Instruction-based Automated Log Analysis
Jun-jie-Huang/LoFI
Source Code for ISSRE-24 paper "Demystifying and Extracting Fault-indicating Information from Logs for Failure Diagnosis".
skypilot-org/skypilot
SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 15+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
fastapi/fastapi
FastAPI framework, high performance, easy to learn, fast to code, ready for production
LLMServe/DistServe
Disaggregated serving system for Large Language Models (LLMs).
IBM/LLM-performance-prediction
Predict the performance of LLM inference services
fmperf-project/fmperf
Cloud Native Benchmarking of Foundation Models
LMCache/LMCache
10x faster long-context LLM serving via smart KV cache optimizations
TencentARC/BrushNet
[ECCV 2024] The official implementation of paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion"
casys-kaist/LLMServingSim
LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale
microsoft/vidur
A large-scale simulation framework for LLM inference
LoongServe/LoongServe
aiwaves-cn/agents
An Open-source Framework for Data-centric, Self-evolving Autonomous Language Agents
WebFuzzing/EvoMaster
The first open-source AI-driven tool for automatically generating system-level test cases (also known as fuzzing) for web/enterprise applications. Currently targeting whitebox and blackbox testing of Web APIs, like REST, GraphQL and RPC (e.g., gRPC and Thrift).