Pinned Repositories
eval_voc
Evaluate VOC data using Python
flash-attention-minimal
Flash Attention in ~100 lines of CUDA (forward pass only)
FPN_pytorch
Implement FPN with PyTorch
Leetcode
Play LeetCode with different programming languages
mynet
myos
DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
cutlass
CUDA Templates for Linear Algebra Subroutines
recommenders-addons
Additional utils and helpers to extend TensorFlow when building recommendation systems, contributed and maintained by SIG Recommenders.
tensorflow
An Open Source Machine Learning Framework for Everyone
luliyucoordinate's Repositories
luliyucoordinate/Leetcode
Play LeetCode with different programming languages
luliyucoordinate/myos
luliyucoordinate/cute-flash-attention
Implement Flash Attention using Cute.
luliyucoordinate/mynet
luliyucoordinate/flash-attention-minimal
Flash Attention in ~100 lines of CUDA (forward pass only)
luliyucoordinate/Python2048
Use Python to implement the 2048 game
luliyucoordinate/CUDA-GEMM-Optimization
CUDA Matrix Multiplication Optimization
luliyucoordinate/ebook
Classic books of computer science!
luliyucoordinate/play-linux
luliyucoordinate/tiny-triton
luliyucoordinate/acwing
My AcWing template
luliyucoordinate/CoreFusionGEMM
luliyucoordinate/cute-gemm
luliyucoordinate/cutlass
CUDA Templates for Linear Algebra Subroutines
luliyucoordinate/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
luliyucoordinate/flatbuffers
FlatBuffers: Memory Efficient Serialization Library
luliyucoordinate/glog
C++ implementation of the Google logging module
luliyucoordinate/HP-CPP
luliyucoordinate/HunyuanDiT
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
luliyucoordinate/ImageBed
Image hosting
luliyucoordinate/LeptTCP
luliyucoordinate/luliyucoordinate.github.io
My blog
luliyucoordinate/MyDisk
Distributed Cloud Disk
luliyucoordinate/mynet-test
luliyucoordinate/recommenders-addons
Additional utils and helpers to extend TensorFlow when building recommendation systems, contributed and maintained by SIG Recommenders.
luliyucoordinate/tensorflow
An Open Source Machine Learning Framework for Everyone
luliyucoordinate/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
luliyucoordinate/ThunderKittens
Tile primitives for speedy kernels
luliyucoordinate/Tiny-Go-Crawler
luliyucoordinate/YHs_Sample
Yinghan's Code Sample