zmxdream
2021.7~2023.9 PaddlePaddle Distributed Training Group; 2023.10~2024.6 Baidu Fengchao Model Training Group. PaddlePaddle/GPUPS/XPUPS/PaddleBox
Baidu, Beijing, China
Pinned Repositories
Paddle
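PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)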
caffe
Caffe on both Linux and Windows
coroutine
An asymmetric coroutine library for C.
DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.
FeatGraph
HugeCTR
HugeCTR is a high-efficiency GPU framework designed for Click-Through-Rate (CTR) estimation training
NCE-CNN-Torch
Noise-Contrastive Estimation for Question Answering with Convolutional Neural Networks (Rao et al. CIKM 2016)
Paddle
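PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)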
tensorflow
An Open Source Machine Learning Framework for Everyone
x-deeplearning
An industrial deep learning framework for high-dimension sparse data
zmxdream's Repositories
zmxdream/FeatGraph
zmxdream/HugeCTR
HugeCTR is a high-efficiency GPU framework designed for Click-Through-Rate (CTR) estimation training
zmxdream/Paddle
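PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)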
zmxdream/tensorflow
An Open Source Machine Learning Framework for Everyone
zmxdream/abseil-cpp
Abseil Common Libraries (C++)
zmxdream/asyncplusplus
Async++ concurrency framework for C++11
zmxdream/ByteTransformer
Optimized BERT transformer inference on NVIDIA GPU. https://arxiv.org/abs/2210.03052
zmxdream/Clustered-Embedding-Learning
Code for the paper "Clustered Embedding Learning for Recommender Systems"
zmxdream/cub
Cooperative primitives for CUDA C++.
zmxdream/cuCollections
zmxdream/CUDA-Programming-Guide-in-Chinese
This is a Chinese translation of the CUDA programming guide
zmxdream/docs
Documentation for PaddlePaddle
zmxdream/fairscale
PyTorch extensions for high performance and large scale training.
zmxdream/FBGEMM
FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/
zmxdream/gcc
zmxdream/jax
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
zmxdream/LargeBatchCTR
Large batch training of CTR models based on DeepCTR with CowClip.
zmxdream/MEGABYTE-pytorch
Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in Pytorch
zmxdream/Megatron-LM
Ongoing research training transformer models at scale
zmxdream/mini-lsm
A tutorial of building an LSM-Tree storage engine in a week!
zmxdream/nccl
Optimized primitives for collective multi-GPU communication
zmxdream/nccl-tests
NCCL Tests
zmxdream/nvidia_tensorflow
An Open Source Machine Learning Framework for Everyone
zmxdream/OptEmbed
This repository contains PyTorch Implementation of CIKM 2022 research-track paper: OptEmbed: Learning Optimal Embedding Table for Click-through Rate Prediction.
zmxdream/PaddleRec
Recommendation Algorithm: a large-scale recommendation algorithm library, including classic and recent recommender-system algorithms such as LR, Wide&Deep, DSSM, TDM, MIND, Word2Vec, Bert4Rec, DeepWalk, SSR, AITM, DSIN, SIGN, IPREC, GRU4Rec, Youtube_dnn, NCF, GNN, FM, FFM, DeepFM, DCN, DIN, DIEN, DLRM, MMOE, PLE, ESMM, ESCMM, MAML, xDeepFM, DeepFEFM, NFM, AFM, RALM, DMR, GateNet, NAML, DIFM, Deep Crossing, PNN, BST, AutoInt, FGCNN, FLEN, Fibinet, ListWise, DeepRec, ENSFM, TiSAS, AutoFIS, and others.
zmxdream/PaddleTest
PaddlePaddle TestSuite
zmxdream/recommenders-addons
Additional utils and helpers to extend TensorFlow when building recommendation systems, contributed and maintained by SIG Recommenders.
zmxdream/triton
Development repository for the Triton language and compiler
zmxdream/xla
A machine learning compiler for GPUs, CPUs, and ML accelerators
zmxdream/zmxdream