Pinned Repositories
transformers-flashattention
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
ufbp
UDP File Broadcasting/Multicasting Protocol
bert
TensorFlow code and pre-trained models for BERT
books
c3p0
a mature, highly concurrent JDBC Connection pooling library, with support for caching and reuse of PreparedStatements.
cabinet
cook
A build tool that tries to cook everything
cpu_bitonicsort
cuda_matmul
cuda_reduce
bryanzhang's Repositories
bryanzhang/LOBS_v2
bryanzhang/layernorm_cpu
bryanzhang/cuda_softmax
bryanzhang/hessian_torch_kernels
Boosted sparse Hessian (or its pseudo-inverse) matmul implementation.
bryanzhang/LOBS_v1
Layer-wise Optimal Brain Surgeon, v1
bryanzhang/LOBS_v0
bryanzhang/cpu_bitonicsort
bryanzhang/cuda_matmul
bryanzhang/high_order_grads
Computing higher-order derivatives
bryanzhang/optimal-brain-damage
Implementation of the Optimal Brain Damage paper in PyTorch (using second order Taylor series to prune neural networks)
bryanzhang/obd_V2
Optimal Brain Damage PyTorch implementation, v2
bryanzhang/obd
Optimal Brain Damage PyTorch implementation
bryanzhang/cuda_reduce
bryanzhang/cuda_transpose
bryanzhang/transpose
Matrix transpose optimization
bryanzhang/triton_fusedattention
FlashAttention V2 inference (FP16) implemented with Triton.
bryanzhang/simple_skip_list
A simple skip list implementation.
bryanzhang/matmul
bryanzhang/transformers-flashattention
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
bryanzhang/smoothquant
SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
bryanzhang/FasterTransformer
Transformer-related optimization, including BERT and GPT
bryanzhang/fastertransformer_backend
For debugging and observation
bryanzhang/threadstacks
ThreadStacks can be used to inspect thread stacktraces of live C/C++ processes.
bryanzhang/libbacktrace
A C library that may be linked into a C/C++ program to produce symbolic backtraces
bryanzhang/gptj-chatbot
bryanzhang/kongmingqi
Reinforcement learning for peg solitaire (Kongming chess)
bryanzhang/bert
TensorFlow code and pre-trained models for BERT
bryanzhang/books
bryanzhang/ufbp
UDP File Broadcasting/Multicasting Protocol
bryanzhang/c3p0
a mature, highly concurrent JDBC Connection pooling library, with support for caching and reuse of PreparedStatements.