Irvingwangjr's Stars
uber-go/automaxprocs
Automatically set GOMAXPROCS to match Linux container CPU quota.
alibaba/open-local
A cloud-native local storage management system for stateful workloads, combining low latency with simplicity
sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
intel/llm-on-ray
Pretrain, fine-tune, and serve LLMs on Intel platforms with Ray
Netflix/asgard
[Asgard is deprecated at Netflix. We use Spinnaker ( www.spinnaker.io ).] Web interface for application deployments and cloud management in Amazon Web Services (AWS). Binary download: http://github.com/Netflix/asgard/releases
NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
v6d-io/v6d
vineyard (v6d): an in-memory immutable data manager. (Project under CNCF, TAG-Storage)
NVIDIA/NeMo-Aligner
Scalable toolkit for efficient model alignment
open-telemetry/opentelemetry-specification
Specifications for OpenTelemetry
InternLM/xtuner
An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
grpc/grpc-go
The Go language implementation of gRPC. HTTP/2 based RPC
mindprince/gonvml
NVIDIA Management Library (NVML) bindings for Go
TUDB-Labs/mLoRA
An efficient "factory" for building multiple LoRA adapters
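The core idea such multi-adapter toolkits build on is the LoRA forward pass, y = Wx + (α/r)·B(Ax), where the base weight W is frozen and shared while each adapter supplies its own low-rank pair (A, B). A minimal pure-Python sketch of that idea (illustrative only; the names and shapes below are assumptions, not mLoRA's API):

```python
def matmul(m, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def lora_forward(W, A, B, x, alpha=1.0):
    """LoRA forward pass: y = W x + (alpha / r) * B (A x).

    W is the frozen base weight; (A, B) is one adapter's low-rank pair,
    with the rank r taken as the number of rows of A.
    """
    r = len(A)
    base = matmul(W, x)                # shared frozen projection
    update = matmul(B, matmul(A, x))   # adapter-specific low-rank path
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]

# Two adapters sharing one base weight (2x2 identity for clarity):
W = [[1.0, 0.0], [0.0, 1.0]]
adapter_1 = ([[1.0, 0.0]], [[0.0], [1.0]])   # (A, B), rank 1
adapter_2 = ([[0.0, 1.0]], [[1.0], [0.0]])
x = [3.0, 4.0]
y1 = lora_forward(W, *adapter_1, x)   # -> [3.0, 7.0]
y2 = lora_forward(W, *adapter_2, x)   # -> [7.0, 4.0]
```

Because W stays frozen, serving many adapters only costs one copy of the base weights plus each adapter's small A and B matrices.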
openkruise/rollouts
Enhanced rollout features for application automation
kudobuilder/kuttl
KUbernetes Test TooL (kuttl)
punica-ai/punica
Serving multiple LoRA-finetuned LLMs as one
cupy/cupy
NumPy & SciPy for GPU
NVIDIA-Merlin/HierarchicalKV
HierarchicalKV is part of NVIDIA Merlin and provides hierarchical key-value storage to meet RecSys requirements. Its key capability is storing key-value feature embeddings in the high-bandwidth memory (HBM) of GPUs and in host memory. It can also be used as a generic key-value store.
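The hierarchical idea can be sketched in a few lines: a small, fast tier (standing in for GPU HBM) that evicts its least-recently-used entries into a larger, slower tier (standing in for host memory), promoting keys back on access. This is a conceptual sketch only, not HierarchicalKV's actual API or eviction policy:

```python
from collections import OrderedDict

class TwoTierKV:
    """Illustrative two-tier key-value store: a capacity-limited fast
    tier with LRU spill-over to an unbounded slow tier."""

    def __init__(self, fast_capacity):
        self.fast = OrderedDict()   # stands in for GPU HBM
        self.slow = {}              # stands in for host memory
        self.cap = fast_capacity

    def put(self, key, value):
        self.fast[key] = value
        self.fast.move_to_end(key)          # mark as most recently used
        while len(self.fast) > self.cap:
            old_key, old_val = self.fast.popitem(last=False)  # evict LRU
            self.slow[old_key] = old_val

    def get(self, key):
        if key in self.fast:
            self.fast.move_to_end(key)      # refresh recency
            return self.fast[key]
        if key in self.slow:
            value = self.slow.pop(key)
            self.put(key, value)            # promote hot key to fast tier
            return value
        return None

kv = TwoTierKV(fast_capacity=2)
kv.put("a", 1)
kv.put("b", 2)
kv.put("c", 3)        # "a" (least recently used) spills to the slow tier
hot = kv.get("a")     # hit in slow tier promotes "a", evicting "b"
```

The real system applies the same tiering to feature embeddings, where keeping the hot working set in HBM is what delivers the bandwidth RecSys lookups need.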
google/tensorstore
Library for reading and writing large multi-dimensional arrays.
google/nsjail
A lightweight process isolation tool that utilizes Linux namespaces, cgroups, rlimits and seccomp-bpf syscall filters, leveraging the Kafel BPF language for enhanced security.
containers/nri-plugins
A collection of community maintained NRI plugins
NVIDIA/cuda-checkpoint
CUDA checkpoint and restore utility
efficient/rdma_bench
A framework to understand RDMA
kubernetes-csi/csi-driver-host-path
A sample (non-production) CSI Driver that creates a local directory as a volume on a single node
facebookresearch/fairscale
PyTorch extensions for high performance and large scale training.
Plan9-Archive/plan9-4e
Mirror of Plan 9 4th Edition from p9f
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
msgpack/msgpack
MessagePack is an extremely efficient object serialization library. It's like JSON, but faster and smaller.
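The size win comes from MessagePack's compact type tags. A hand-rolled sketch of a tiny subset of the format (fixmap, fixstr, bool, positive fixint) shows why: the map below encodes to 18 bytes, versus 27 for the compact JSON text. Real code should of course use the msgpack library rather than this toy encoder:

```python
def pack(obj):
    """Encode a tiny subset of the MessagePack format."""
    # bool must be checked before int, since bool subclasses int in Python
    if isinstance(obj, bool):
        return bytes([0xC3 if obj else 0xC2])    # true / false
    if isinstance(obj, int) and 0 <= obj <= 0x7F:
        return bytes([obj])                      # positive fixint
    if isinstance(obj, str) and len(obj.encode("utf-8")) < 32:
        data = obj.encode("utf-8")
        return bytes([0xA0 | len(data)]) + data  # fixstr
    if isinstance(obj, dict) and len(obj) < 16:
        out = bytes([0x80 | len(obj)])           # fixmap
        for k, v in obj.items():
            out += pack(k) + pack(v)
        return out
    raise TypeError("outside the sketched subset")

encoded = pack({"compact": True, "schema": 0})
# 18 bytes: one fixmap byte, two length-prefixed strings,
# one byte for true, one byte for the integer 0
```

Every value in this subset carries at most one byte of overhead, whereas JSON spends bytes on quotes, colons, braces, and spelled-out literals like `true`.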
Tradias/asio-grpc
Asynchronous gRPC with Asio/unified executors
intel/pcm
Intel® Performance Counter Monitor (Intel® PCM)