Pinned Repositories
bagua
Bagua speeds up PyTorch.
bagua-net
High-performance NCCL plugin for Bagua.
ACM-ICPC-api-service
ACM-ICPC-frontend
blueprint-trainer
Scaffolding for sequence model training research.
c4-dataset-script
Inspired by Google's C4 dataset, a set of data-cleaning scripts for processing CommonCrawl, including the Chinese data processing and cleaning methods from MassiveText.
mamba-jax
megabyte
A PyTorch implementation of MEGABYTE. This multi-scale transformer architecture is tokenization-free and uses sub-quadratic attention. Paper: https://arxiv.org/abs/2305.07185
ocr_game
shu
Collection and curation of Chinese books.
shjwudp's Repositories
shjwudp/shu
Collection and curation of Chinese books.
shjwudp/c4-dataset-script
Inspired by Google's C4 dataset, a set of data-cleaning scripts for processing CommonCrawl, including the Chinese data processing and cleaning methods from MassiveText.
shjwudp/mamba-jax
shjwudp/megabyte
A PyTorch implementation of MEGABYTE. This multi-scale transformer architecture is tokenization-free and uses sub-quadratic attention. Paper: https://arxiv.org/abs/2305.07185
shjwudp/blueprint-trainer
Scaffolding for sequence model training research.
shjwudp/apex
A PyTorch extension: tools for easy mixed precision and distributed training in PyTorch
shjwudp/bagua-core
Core communication lib for Bagua.
shjwudp/BLOOM-COT
Ongoing research training transformer language models at scale, including: BERT & GPT-2
shjwudp/conversational-datasets
shjwudp/do-we-need-attention
shjwudp/GLM-130B
GLM-130B: An Open Bilingual Pre-Trained Model
shjwudp/gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
shjwudp/GPU-math
🤯 GPU math & benchmarks, branched from mli/transformers-benchmarks
shjwudp/grouped_gemm
PyTorch bindings for CUTLASS grouped GEMM.
shjwudp/Huggingface-Model-Service
shjwudp/hyena-jax
JAX/Flax implementation of the Hyena Hierarchy
shjwudp/juicefs
JuiceFS is a distributed POSIX file system built on top of Redis and S3.
shjwudp/MEGABYTE-pytorch
Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in PyTorch
shjwudp/Megatron-LM
Ongoing research training transformer language models at scale, including: BERT & GPT-2
shjwudp/NeMo
NeMo: a toolkit for conversational AI
shjwudp/OptimalShardedDataParallel
An automated parallel training system that combines the advantages of data and model parallelism. If you are interested, please visit/star/fork https://github.com/Youhe-Jiang/OptimalShardedDataParallel
shjwudp/RWKV-LM
RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), combining the best of RNNs and transformers: great performance, fast inference, low VRAM usage, fast training, "infinite" ctx_len, and free sentence embeddings.
shjwudp/S5
shjwudp/safari
Convolutions for Sequence Modeling
shjwudp/shjwudp.github.io
shjwudp/TimeChamber
A Massively Parallel Large Scale Self-Play Framework
shjwudp/tinygrad
You like pytorch? You like micrograd? You love tinygrad! ❤️
shjwudp/Titans
A collection of models built with ColossalAI
shjwudp/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
shjwudp/twitter-dialogue