Pinned Repositories
6.5930-final-project-2023
6s965-fall2022
boxiangw.github.io
My homepage
CS262
Harvard CS262 Introduction to Distributed Computing
flash-attention
Fast and memory-efficient exact attention
ColossalAI
Making large AI models cheaper, faster and more accessible
Megatron-LM
Ongoing research training transformer models at scale
NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal models, and Speech AI (automatic speech recognition and text-to-speech)
NeMo-Framework-Launcher
Provides end-to-end model development pipelines for LLMs and Multimodal models that can be launched on-premises or in the cloud.
TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.
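
Since the TransformerEngine entry above describes FP8 execution in a single sentence, here is a minimal sketch of what that looks like in practice, assuming an FP8-capable (Hopper or Ada) GPU and the transformer_engine PyTorch package; the layer sizes and DelayedScaling settings are illustrative, not a tuned recipe.

```python
# Minimal FP8 sketch with TransformerEngine (assumes a Hopper/Ada GPU and
# the transformer_engine package). Sizes and recipe values are illustrative.
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

hidden, batch = 1024, 16                      # illustrative dimensions
layer = te.Linear(hidden, hidden, bias=True).cuda()
x = torch.randn(batch, hidden, device="cuda")

# Delayed scaling tracks a running amax history to pick FP8 scale factors;
# HYBRID uses E4M3 for forward tensors and E5M2 for gradients.
fp8_recipe = recipe.DelayedScaling(
    margin=0, fp8_format=recipe.Format.HYBRID,
    amax_history_len=16, amax_compute_algo="max",
)

with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    out = layer(x)                            # GEMM runs in FP8
out.sum().backward()                          # FP8-aware backward pass
```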
BoxiangW's Repositories
BoxiangW/CS262
Harvard CS262 Introduction to Distributed Computing
BoxiangW/6.5930-final-project-2023
BoxiangW/6s965-fall2022
BoxiangW/boxiangw.github.io
My homepage
BoxiangW/ChatGLM-6B
ChatGLM-6B: An Open Bilingual Dialogue Language Model
BoxiangW/ColossalAI
Colossal-AI: A Unified Deep Learning System for Large-Scale Parallel Training
BoxiangW/ColossalAI-Benchmark
Performance benchmarking with ColossalAI
BoxiangW/ColossalAI-Examples
Examples of training models with hybrid parallelism using ColossalAI
BoxiangW/ControlNet
Let us control diffusion models!
BoxiangW/dalai
The simplest way to run LLaMA on your local machine
BoxiangW/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
BoxiangW/latent-diffusion
High-Resolution Image Synthesis with Latent Diffusion Models
BoxiangW/MockingBird
🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech content in real time
BoxiangW/Poker
Fully functional poker bot that works on PartyPoker, PokerStars, and GGPoker, scraping tables with OpenCV (adaptable via GUI) or a neural network and making decisions based on a genetic algorithm and Monte Carlo simulation for poker equity calculation (see the equity sketch at the end of this list). Binaries can be downloaded with this link:
BoxiangW/SkyComputing
Sky Computing: Accelerating Geo-distributed Computing in Federated Learning
BoxiangW/flash-attention
Fast and memory-efficient exact attention (see the usage sketch at the end of this list)
BoxiangW/llama
Inference code for LLaMA models
BoxiangW/llama.cpp
Port of Facebook's LLaMA model in C/C++
BoxiangW/Megatron-LM
Ongoing research training transformer models at scale
BoxiangW/NeMo
NeMo: a framework for generative AI
BoxiangW/NeMo-Megatron-Launcher
NeMo Megatron launcher and tools
BoxiangW/ohmyzsh
🙃 A delightful community-driven (with 2,100+ contributors) framework for managing your zsh configuration. Includes 300+ optional plugins (rails, git, macOS, hub, docker, homebrew, node, php, python, etc), 140+ themes to spice up your morning, and an auto-update tool that makes it easy to keep up with the latest updates from the community.
BoxiangW/Pai-Megatron-Patch
The official repo of Pai-Megatron-Patch, developed by Alibaba Cloud, for large-scale LLM and VLM training.
BoxiangW/PCA_linear_autoencoder
BoxiangW/stable-diffusion
A latent text-to-image diffusion model
BoxiangW/TexasSolver
🚀 A very efficient Texas Hold'em GTO solver ♠♥♣♦
BoxiangW/TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.
BoxiangW/triton
Development repository for the Triton language and compiler
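
The flash-attention repository (pinned above and forked in this list) exposes its kernel through flash_attn_func; here is a minimal usage sketch, assuming a CUDA GPU, the flash-attn package, and fp16 tensors in the (batch, seqlen, heads, headdim) layout the function expects. All sizes are illustrative.

```python
# Minimal FlashAttention call (assumes the flash-attn package and a CUDA
# GPU; q/k/v must be fp16 or bf16 in (batch, seqlen, heads, headdim) layout).
import torch
from flash_attn import flash_attn_func

batch, seqlen, nheads, headdim = 2, 1024, 8, 64   # illustrative sizes
q = torch.randn(batch, seqlen, nheads, headdim,
                device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

# Exact attention computed without materializing the seqlen x seqlen score
# matrix, so memory scales linearly in sequence length; causal=True applies
# a causal mask for decoder-style models.
out = flash_attn_func(q, k, v, causal=True)
print(out.shape)  # (2, 1024, 8, 64)
```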
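
The Poker repository's description mentions Monte Carlo simulation for equity calculation; the sketch below illustrates that idea with a toy 5-card evaluator and random rollouts. It is not the repository's implementation, and every name here (hand_rank, best5, equity) is hypothetical.

```python
# Toy Monte Carlo equity estimator: deal random boards and villain hands,
# score everyone's best 5-card hand, and count hero's wins and ties.
import random
from itertools import combinations
from collections import Counter

DECK = [(r, s) for r in range(2, 15) for s in "shdc"]  # 2..14 = deuce..ace

def hand_rank(cards):
    """Score a 5-card hand; higher tuples beat lower ones."""
    ranks = sorted((c[0] for c in cards), reverse=True)
    counts = Counter(ranks)
    # Order ranks by (multiplicity, rank) so pairs/trips dominate kickers.
    by_count = sorted(counts, key=lambda r: (counts[r], r), reverse=True)
    flush = len({c[1] for c in cards}) == 1
    straight = len(counts) == 5 and ranks[0] - ranks[4] == 4
    if ranks == [14, 5, 4, 3, 2]:            # wheel: A-2-3-4-5 plays low
        straight, by_count = True, [5, 4, 3, 2, 1]
    shape = sorted(counts.values(), reverse=True)
    cat = (8 if straight and flush else 7 if shape == [4, 1] else
           6 if shape == [3, 2] else 5 if flush else 4 if straight else
           3 if shape == [3, 1, 1] else 2 if shape == [2, 2, 1] else
           1 if shape == [2, 1, 1, 1] else 0)
    return (cat, by_count)

def best5(cards):
    """Best 5-card hand out of 7 cards."""
    return max(hand_rank(c) for c in combinations(cards, 5))

def equity(hero, villains=1, trials=5000):
    """Estimate hero's preflop equity against random hands by rollouts."""
    wins = ties = 0.0
    rest = [c for c in DECK if c not in hero]
    for _ in range(trials):
        draw = random.sample(rest, 2 * villains + 5)
        board = draw[:5]
        hero_best = best5(list(hero) + board)
        vill_best = max(best5(draw[5 + 2 * i:7 + 2 * i] + board)
                        for i in range(villains))
        if hero_best > vill_best:
            wins += 1
        elif hero_best == vill_best:
            ties += 1
    return (wins + ties / 2) / trials

# Pocket aces vs. one random hand: roughly 0.85.
print(equity([(14, "s"), (14, "h")]))
```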