yejingxin's Repositories
yejingxin/minAone
yejingxin/ai-on-gke
AI on GKE is a collection of examples, best practices, and prebuilt solutions to help build, deploy, and scale AI platforms on Google Kubernetes Engine.
yejingxin/diffusers
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch
yejingxin/jax
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
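As a sketch of the "differentiate, vectorize, JIT" trio the description names (the linear-model loss, shapes, and variable names below are illustrative, not taken from this fork):

```python
# Minimal illustration of JAX's composable transformations.
import jax
import jax.numpy as jnp

def loss(w, x, y):
    # Squared error of a linear model (illustrative only).
    return jnp.mean((x @ w - y) ** 2)

grad_loss = jax.grad(loss)                   # differentiate w.r.t. w
per_example = jax.vmap(loss, (None, 0, 0))   # vectorize over a batch axis
fast_grad = jax.jit(grad_loss)               # JIT-compile for CPU/GPU/TPU

w, x, y = jnp.ones(3), jnp.ones((8, 3)), jnp.zeros(8)
print(fast_grad(w, x, y))     # gradient, shape (3,)
print(per_example(w, x, y))   # one loss per example, shape (8,)
```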
yejingxin/HighPerfLLMs2024
yejingxin/jaxformer
Minimal library to train LLMs on TPU in JAX with pjit().
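A minimal sketch of the pjit()-style sharding the description refers to, assuming a 1-D device mesh; the axis name "data", the function, and the shapes are illustrative, and the batch axis must divide evenly across the available devices:

```python
# Illustrative pjit usage: shard a batch axis across a device mesh.
import numpy as np
import jax
import jax.numpy as jnp
from jax.experimental.pjit import pjit
from jax.sharding import Mesh, PartitionSpec as P

# 1-D mesh over whatever devices are available (1 CPU works too).
mesh = Mesh(np.array(jax.devices()), axis_names=("data",))

def step(x):
    return x * 2.0

# Partition the leading (batch) axis of input and output over "data".
sharded_step = pjit(step, in_shardings=P("data", None),
                    out_shardings=P("data", None))

with mesh:
    x = jnp.arange(16.0).reshape(8, 2)
    print(sharded_step(x))
```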
yejingxin/litgpt
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
yejingxin/maxtext
A simple, performant, and scalable JAX LLM!
yejingxin/Megatron-LM
Ongoing research training transformer models at scale
yejingxin/ml-testing-accelerators
Testing framework for deep learning models (TensorFlow and PyTorch) on Google Cloud hardware accelerators (TPUs and GPUs)
yejingxin/NeMo
A scalable generative AI framework built for researchers and developers working on large language models, multimodal models, and speech AI (automatic speech recognition and text-to-speech)
yejingxin/nvidia-resiliency-ext
NVIDIA Resiliency Extension is a Python package for framework developers and users to implement fault-tolerant features. It improves effective training time by minimizing downtime caused by failures and interruptions.
yejingxin/orbax
Orbax provides common utility libraries for JAX users.
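For example, a minimal sketch of Orbax's pytree checkpointing; the path and the toy state below are illustrative:

```python
# Save and restore a JAX pytree with Orbax checkpointing.
import jax.numpy as jnp
import orbax.checkpoint as ocp

state = {"step": 100, "params": {"w": jnp.ones((3, 3))}}

ckptr = ocp.PyTreeCheckpointer()
ckptr.save("/tmp/orbax_demo", state)         # target directory must not already exist
restored = ckptr.restore("/tmp/orbax_demo")  # returns the same pytree structure
print(restored["step"])
```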
yejingxin/paxml
yejingxin/pytorch-lightning
Pretrain, finetune, and deploy AI models on multiple GPUs and TPUs with zero code changes.
yejingxin/ray
Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a toolkit of libraries (Ray AIR) for accelerating ML workloads.
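A minimal sketch of Ray's core task API from the description above; the square() task is illustrative:

```python
# Run independent Python functions in parallel as Ray tasks.
import ray

ray.init()  # start a local Ray runtime

@ray.remote
def square(x):
    return x * x

# Launch four tasks in parallel and block on the results.
futures = [square.remote(i) for i in range(4)]
print(ray.get(futures))  # [0, 1, 4, 9]
```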
yejingxin/saxml
yejingxin/serving
A flexible, high-performance serving system for machine learning models
yejingxin/t5x
yejingxin/tf-lingvo
Lingvo
yejingxin/TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
yejingxin/torchtitan
A native PyTorch library for large model training
yejingxin/tpu-tools
Reference models and tools for Cloud TPUs.
yejingxin/UCSD_BigData
A repository for scripts and notebooks for the UCSD big data course
yejingxin/veScale
A PyTorch Native LLM Training Framework
yejingxin/xla
Enabling PyTorch on Google TPU
yejingxin/xpk
xpk (Accelerated Processing Kit, pronounced x-p-k) is a software tool that helps Cloud developers orchestrate training jobs on accelerators such as TPUs and GPUs on GKE.
yejingxin/yejingxin.github.io