zhjiang22's Stars
OpsPAI/TraceZip
This repository contains the components used to build a prototype system of TraceZip, consisting of four pieces.
openai/openai-agents-python
A lightweight, powerful framework for multi-agent workflows
dyweb/papers-notebook
:page_facing_up: :cn: :page_with_curl: Papers Notebook: paper reading notes on distributed systems, virtualization, and machine learning
ldos-project/TraceLLM
deepseek-ai/open-infra-index
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
snowie2000/mactype
Better font rendering for Windows.
JetBrains-Research/stack-trace-deduplication
Code for the embedding and reranker models, as well as for evaluation, from the paper "Stack Trace Deduplication: Faster, More Accurately, and in More Realistic Scenarios".
robusta-dev/holmesgpt
Your 24/7 On-Call AI Agent - Solve Alerts Faster with Automatic Correlations, Investigations, and More
Faustinaqq/CKAAD
TsinghuaDatabaseGroup/DB-GPT
An LLM Based Diagnosis System (https://arxiv.org/pdf/2312.01454.pdf)
punica-ai/punica
Serving multiple LoRA-finetuned LLMs as one
mit-pdos/sigmaos
dywsjtu/apparate
Artifact for "Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving" [SOSP '24]
muse-research-lab/llm-inference-workload-eval
This repository contains all the code used for the experimental analysis of the paper "The Importance of Workload Choice in Evaluating LLM Inference Systems".
zhengzangw/Sequence-Scheduling
PyTorch implementation of paper "Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline".
microsoft/ParrotServe
[OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable
lunyiliu/LogLM
From Task-based to Instruction-based Automated Log Analysis
Jun-jie-Huang/LoFI
Source Code for ISSRE-24 paper "Demystifying and Extracting Fault-indicating Information from Logs for Failure Diagnosis".
skypilot-org/skypilot
SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 15+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
fastapi/fastapi
FastAPI framework, high performance, easy to learn, fast to code, ready for production
LLMServe/DistServe
Disaggregated serving system for Large Language Models (LLMs).
IBM/LLM-performance-prediction
Predict the performance of LLM inference services
fmperf-project/fmperf
Cloud Native Benchmarking of Foundation Models
LMCache/LMCache
10x faster long-context LLM serving via smart KV cache optimizations
TencentARC/BrushNet
[ECCV 2024] The official implementation of paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion"
casys-kaist/LLMServingSim
LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale
microsoft/vidur
A large-scale simulation framework for LLM inference
LoongServe/LoongServe
aiwaves-cn/agents
An Open-source Framework for Data-centric, Self-evolving Autonomous Language Agents
WebFuzzing/EvoMaster
The first open-source AI-driven tool for automatically generating system-level test cases (also known as fuzzing) for web/enterprise applications. Currently targeting whitebox and blackbox testing of Web APIs, like REST, GraphQL and RPC (e.g., gRPC and Thrift).