ParCIS Lab, BUPT
Parallel Computing and Intelligent Systems Laboratory (ParCIS Lab), Beijing University of Posts and Telecommunications
China
Pinned Repositories
Chimera
Chimera: bidirectional pipeline parallelism for efficiently training large-scale models.
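A minimal sketch of the stage-placement idea behind bidirectional pipeline parallelism: two pipelines traverse the same devices in opposite directions, so each device hosts one stage from each pipeline and their bubbles can overlap. The names below (`num_stages`, `bidirectional_placement`) are illustrative and not part of Chimera's API.

```python
# Sketch: map pipeline stages to devices for a "down" pipeline and an "up"
# pipeline running in opposite directions over the same devices.

def bidirectional_placement(num_stages: int) -> dict:
    """Return a stage -> device mapping for the two opposing pipelines."""
    down = {stage: stage for stage in range(num_stages)}                  # stage i -> device i
    up = {stage: num_stages - 1 - stage for stage in range(num_stages)}   # reversed order
    return {"down": down, "up": up}

if __name__ == "__main__":
    # With 4 devices: device 0 hosts down-stage 0 and up-stage 3, and so on.
    print(bidirectional_placement(4))
```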
DNN-cpp-proxies
C++/MPI proxies for distributed training of deep neural networks.
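As a rough illustration of the communication pattern such proxies exercise, the mpi4py sketch below performs an allreduce over a gradient-sized buffer, standing in for data-parallel gradient synchronization. The actual proxies are C++/MPI; the buffer size and launch command here are assumptions for the example.

```python
# Sketch: allreduce of a gradient-sized buffer across MPI ranks.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
grad = np.random.rand(1 << 20).astype(np.float32)   # stand-in for a gradient shard
summed = np.empty_like(grad)

comm.Allreduce(grad, summed, op=MPI.SUM)            # sum gradients across ranks
summed /= comm.Get_size()                           # average, as data-parallel SGD would

if comm.Get_rank() == 0:
    print("allreduced", grad.nbytes, "bytes across", comm.Get_size(), "ranks")
```

Run, for example, with `mpiexec -n 4 python proxy_sketch.py`.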
FlashSparse
FlashSparse significantly reduces computation redundancy for unstructured sparsity (in SpMM and SDDMM) on Tensor Cores through a Swap-and-Transpose mapping strategy. FlashSparse was accepted at PPoPP 2025.
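A small NumPy sketch of the linear-algebra identity that a swap-and-transpose mapping relies on: computing (Bᵀ Aᵀ)ᵀ instead of A B lets the operands be fed into the multiply in swapped roles while producing the same result. This only illustrates the identity; FlashSparse's actual Tensor Core kernels are CUDA.

```python
# Sketch: swapping operands and transposing the result reproduces A @ B.
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((16, 32))            # stand-in for the sparse operand in SpMM
A[rng.random(A.shape) < 0.8] = 0    # make it unstructured-sparse
B = rng.random((32, 8))             # dense operand

direct = A @ B                      # the original SpMM orientation
swapped = (B.T @ A.T).T             # swap operands, then transpose the result

assert np.allclose(direct, swapped)
print("A @ B == (B^T @ A^T)^T holds, shape:", swapped.shape)
```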
Magicube
Magicube is a high-performance library for quantized sparse matrix operations (SpMM and SDDMM) in deep learning on Tensor Cores.
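A rough NumPy sketch of the quantized-SpMM idea: operands are quantized to low-precision integers, multiplied with integer arithmetic (the kind Tensor Cores accelerate), and the result is rescaled. The scale handling below is deliberately simplistic and is not Magicube's quantization scheme.

```python
# Sketch: int8-style quantize -> integer matmul -> dequantize.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((64, 64))
A[rng.random(A.shape) < 0.7] = 0            # sparse operand
B = rng.standard_normal((64, 32))           # dense operand

def quantize(x, bits=8):
    scale = np.abs(x).max() / (2 ** (bits - 1) - 1)
    return np.round(x / scale).astype(np.int32), scale

Aq, a_scale = quantize(A)
Bq, b_scale = quantize(B)

C_int = Aq @ Bq                                       # integer matrix multiply
C = C_int.astype(np.float32) * a_scale * b_scale      # rescale back to floats

print("max abs error vs. full precision:", np.abs(C - A @ B).max())
```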
Ok-Topk
Ok-Topk is a scheme for distributed training with sparse gradients. It integrates a novel sparse allreduce algorithm (with less than 6k communication volume, which is asymptotically optimal) with the decentralized parallel Stochastic Gradient Descent (SGD) optimizer, and its convergence is proven both theoretically and empirically.
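A minimal NumPy sketch of the generic top-k gradient sparsification that such schemes build on: keep only the k largest-magnitude gradient entries per step and carry the residual forward as error feedback. The sparse allreduce and its less-than-6k communication bound are Ok-Topk's contribution and are not reproduced here; the helper below is hypothetical.

```python
# Sketch: top-k gradient sparsification with error feedback.
import numpy as np

def topk_sparsify(grad, k, residual):
    """Return (indices, values, new_residual) for the k largest-magnitude entries."""
    acc = grad + residual                          # fold in the error feedback
    idx = np.argpartition(np.abs(acc), -k)[-k:]    # indices of the k largest magnitudes
    vals = acc[idx]
    new_residual = acc.copy()
    new_residual[idx] = 0.0                        # what was not sent is carried over
    return idx, vals, new_residual

grad = np.random.default_rng(2).standard_normal(1000)
residual = np.zeros_like(grad)
idx, vals, residual = topk_sparsify(grad, k=10, residual=residual)
print("sent", len(idx), "of", grad.size, "gradient entries")
```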