Pinned Repositories
AI-4K
ailearning
AiLearning: data analysis + machine learning in practice + linear algebra + PyTorch + NLTK + TF2
Bilibili-plus
Course videos, slides, and source code: Hou Jie's C++ series; Yan-Fu Kuo's MATLAB course (National Taiwan University)
cmake-examples-Chinese
A quick-start guide to CMake that teaches the syntax through worked examples. Read online: https://sfumecjf.github.io/cmake-examples-Chinese/
code-samples
Source code examples from the Parallel Forall Blog
community
PaddlePaddle Developer Community
CUDA-PPT
CUDA_Freshman
cutlass
CUDA Templates for Linear Algebra Subroutines
hello-algo
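"Hello Algo": a data structures and algorithms tutorial with animated illustrations and one-click runnable code, supporting Java, C++, Python, Go, JS, TS, C#, Swift, Rust, Dart, Zig, and other languages.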
carryyu's Repositories
carryyu/AI-4K
carryyu/hello-algo
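"Hello Algo": a data structures and algorithms tutorial with animated illustrations and one-click runnable code, supporting Java, C++, Python, Go, JS, TS, C#, Swift, Rust, Dart, Zig, and other languages.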
carryyu/ailearning
AiLearning: data analysis + machine learning in practice + linear algebra + PyTorch + NLTK + TF2
carryyu/Bilibili-plus
Course videos, slides, and source code: Hou Jie's C++ series; Yan-Fu Kuo's MATLAB course (National Taiwan University)
carryyu/cmake-examples-Chinese
A quick-start guide to CMake that teaches the syntax through worked examples. Read online: https://sfumecjf.github.io/cmake-examples-Chinese/
carryyu/code-samples
Source code examples from the Parallel Forall Blog
carryyu/community
PaddlePaddle Developer Community
carryyu/CUDA-PPT
carryyu/CUDA_Freshman
carryyu/cutlass
CUDA Templates for Linear Algebra Subroutines
carryyu/docs
Documentations for PaddlePaddle
carryyu/FasterTransformer
Transformer related optimization, including BERT, GPT
carryyu/flash-attention
Fast and memory-efficient exact attention
carryyu/flashinfer
FlashInfer: Kernel Library for LLM Serving
carryyu/how-to-optim-algorithm-in-cuda
How to optimize some algorithms in CUDA.
carryyu/How_to_optimize_in_GPU
A series on GPU optimization that explains in detail how to optimize CUDA kernels. It covers several basic kernel optimizations, including elementwise, reduce, sgemv, and sgemm. The performance of these kernels is at or near the theoretical limit.
carryyu/interview
📚 C/C++
carryyu/openmlsys-zh
《Machine Learning Systems: Design and Implementation》- Chinese Version
carryyu/Paddle
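PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle core framework: high-performance single-machine and distributed training, and cross-platform deployment, for deep learning and machine learning)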
carryyu/Paddle-Inference-Demo
carryyu/PaddleFleetX
Paddle Distributed Training Examples: Resnet, Bert, GPT, MOE, DataParallel, ModelParallel, PipelineParallel, HybridParallel, AutoParallel, Zero, Sharding, Recompute, GradientMerge, Offload, AMP, DGC, LocalSGD, Wide&Deep
carryyu/PaddleHub
Awesome pre-trained models toolkit based on PaddlePaddle. (400+ models including Image, Text, Audio, Video and Cross-Modal with Easy Inference & Serving)
carryyu/PaddleNLP
👑 Easy-to-use and powerful NLP library with 🤗 Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis etc.
carryyu/ppl.nn
A primitive library for neural network
carryyu/stable-diffusion-webui
Stable Diffusion web UI
carryyu/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
carryyu/XJTU-thesis
An official LaTeX template for Xi'an Jiaotong University degree theses, for master's and doctoral degrees (Chinese and English)