Pinned Repositories
CS516_DataPipeline
docker-python
Kaggle Python docker image
Duke-Tsinghua-MLSS-2017
Duke-Tsinghua Machine Learning Summer School 2017
JetStream
JetStream is a throughput- and memory-optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in the future -- PRs welcome).
jetstream-pytorch
PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference
llama
Inference code for LLaMA models
lsy323.github.io
GitHub Pages test
lsy323's Repositories
lsy323/CS516_DataPipeline
lsy323/docker-python
Kaggle Python docker image
lsy323/Duke-Tsinghua-MLSS-2017
Duke-Tsinghua Machine Learning Summer School 2017
lsy323/JetStream
JetStream is a throughput- and memory-optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in the future -- PRs welcome).
lsy323/jetstream-pytorch
PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference
lsy323/llama
Inference code for LLaMA models
lsy323/lsy323.github.io
GitHub Pages test
lsy323/ml-auto-solutions
A simplified and automated orchestration workflow to perform ML end-to-end (E2E) model tests and benchmarking on Cloud VMs across different frameworks.
lsy323/ml-testing-accelerators
Testing framework for deep learning models (TensorFlow and PyTorch) on Google Cloud hardware accelerators (TPU and GPU)
lsy323/onnx
Open standard for machine learning interoperability
lsy323/openxla-xla
A machine learning compiler for GPUs, CPUs, and ML accelerators
lsy323/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
lsy323/tensorflow
An Open Source Machine Learning Framework for Everyone
lsy323/stablehlo
Backward compatible ML compute opset inspired by HLO/MHLO
lsy323/tpu_debug
lsy323/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
lsy323/xla
Enabling PyTorch on Google TPU