carryyu

lzy

Pinned Repositories

AI-4K
Language:Python10
ailearning
AiLearning: data analysis + hands-on machine learning + linear algebra + PyTorch + NLTK + TF2
Language:Python00
Bilibili-plus
Course videos, slides, and source code: Hou Jie's C++ series; MATLAB by Yan-Fu Kuo (National Taiwan University)
Language:C++00
cmake-examples-Chinese
Get started with CMake quickly by learning the syntax through examples. Read online: https://sfumecjf.github.io/cmake-examples-Chinese/
Language:C++00
code-samples
Source code examples from the Parallel Forall Blog
Language:HTML00
community
PaddlePaddle Developer Community
00
CUDA-PPT
00
CUDA_Freshman
Language:Cuda00
cutlass
CUDA Templates for Linear Algebra Subroutines
Language:C++00
hello-algo
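Hello Algo: a data structures and algorithms tutorial with animated illustrations and one-click runnable code, supporting Java, C++, Python, Go, JS, TS, C#, Swift, Rust, Dart, Zig, and other languages.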
Language:Java10

carryyu's Repositories

carryyu/AI-4K
Language:Python10
carryyu/hello-algo
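Hello Algo: a data structures and algorithms tutorial with animated illustrations and one-click runnable code, supporting Java, C++, Python, Go, JS, TS, C#, Swift, Rust, Dart, Zig, and other languages.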
Language:Java10
carryyu/ailearning
AiLearning: data analysis + hands-on machine learning + linear algebra + PyTorch + NLTK + TF2
Language:Python00
carryyu/Bilibili-plus
Course videos, slides, and source code: Hou Jie's C++ series; MATLAB by Yan-Fu Kuo (National Taiwan University)
Language:C++00
carryyu/cmake-examples-Chinese
Get started with CMake quickly by learning the syntax through examples. Read online: https://sfumecjf.github.io/cmake-examples-Chinese/
Language:C++00
carryyu/code-samples
Source code examples from the Parallel Forall Blog
Language:HTML00
carryyu/community
PaddlePaddle Developer Community
00
carryyu/CUDA-PPT
00
carryyu/CUDA_Freshman
Language:Cuda00
carryyu/cutlass
CUDA Templates for Linear Algebra Subroutines
Language:C++00
carryyu/docs
Documentations for PaddlePaddle
Language:Python00
carryyu/FasterTransformer
Transformer related optimization, including BERT, GPT
Language:C++00
carryyu/flash-attention
Fast and memory-efficient exact attention
Language:Python
carryyu/flashinfer
FlashInfer: Kernel Library for LLM Serving
carryyu/how-to-optim-algorithm-in-cuda
How to optimize some algorithms in CUDA.
carryyu/How_to_optimize_in_GPU
This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, sgemv, sgemm, etc. The performance of these kernels is basically at or near the theoretical limit.
carryyu/interview
📚 C/C++
carryyu/openmlsys-zh
《Machine Learning Systems: Design and Implementation》- Chinese Version
carryyu/Paddle
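PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the core framework of PaddlePaddle: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)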
Language:C++
carryyu/Paddle-Inference-Demo
carryyu/PaddleFleetX
Paddle Distributed Training Examples: Resnet Bert GPT MOE DataParallel ModelParallel PipelineParallel HybridParallel AutoParallel Zero Sharding Recompute GradientMerge Offload AMP DGC LocalSGD Wide&Deep
Language:Python
carryyu/PaddleHub
Awesome pre-trained models toolkit based on PaddlePaddle. (400+ models including Image, Text, Audio, Video and Cross-Modal with Easy Inference & Serving)
carryyu/PaddleNLP
👑 Easy-to-use and powerful NLP library with 🤗 Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis etc.
Language:Python
carryyu/ppl.nn
A primitive library for neural network
Language:C++
carryyu/stable-diffusion-webui
Stable Diffusion web UI
carryyu/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
carryyu/XJTU-thesis
An official LaTeX template for Xi'an Jiaotong University degree theses, for master's and doctoral degrees (Chinese and English)
