foreverlms


Bytedance, Shanghai, China

Pinned Repositories

100-shell-script-examples
Collection of shell scripts found on the internet
Language: Shell 0 1 00
abc
code snippets
Language: Jupyter Notebook 0 2 00
Android-1
Android related examples
Language: Java 0 2 00
Android-App-Development
This repository contains all the source code examples and the FAQ for our Android App Development Specialization for Coursera
Language: Java 0 2 00
android-fundamentals
Language: Java 00
Python
Exercises written while learning Python.
Language: Python 3 3 03
ROS-Learning
Practice code written while learning ROS.
Language: CMake 0 2 00
slam14
Worked examples and selected exercises from Dr. Gao Xiang's book 《视觉SLAM十四讲》 (14 Lectures on Visual SLAM).
Language: C++ 6 2 13
tingta
Obtains a NetEase Cloud Music username from the NetEase Cloud Music tracks a user shares on WeChat Moments.
Language: Java 1 2 00
vio
Coursework for the "Hand-Writing VIO from Scratch" course.
Language: C++ 1 1 00

foreverlms's Repositories

foreverlms/abc
code snippets
Language: Jupyter Notebook 0 2 00
foreverlms/armnn
Arm NN ML Software. The code here is a read-only mirror of https://review.mlplatform.org/admin/repos/ml/armnn
foreverlms/awesome-tensor-compilers
A list of awesome compiler projects and papers for tensor computation and deep learning.
0 0
foreverlms/cfx-article-src
foreverlms/composable_kernel
Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators
Language: C++
foreverlms/CUDA-Learn-Notes
📚150+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).
Language: Cuda 0 0
foreverlms/cuda-training-series
Training materials associated with NVIDIA's CUDA Training Series (www.olcf.ornl.gov/cuda-training-series/)
foreverlms/cuda_sgemm
Language: Cuda 0 0
foreverlms/cute-gemm-101
foreverlms/Cute-Learning
foreverlms/cutlass
CUDA Templates for Linear Algebra Subroutines
Language: C++
foreverlms/cutlass-kernels
Language: Cuda 0 0
foreverlms/dev-sidecar
Developer sidecar: works around GitHub being unreachable and speeds up GitHub, git clone, git release downloads, and Stack Overflow.
Language: JavaScript 1 0
foreverlms/flash-attention
Fast and memory-efficient exact attention
foreverlms/flashinfer
FlashInfer: Kernel Library for LLM Serving
Language: Cuda
foreverlms/folly
An open-source C++ library developed and used at Facebook.
Language: C++ 1 0
foreverlms/foreverlms.github.io
Personal blog, based on the template from izhengfan.github.io.
Language: HTML 1 01
foreverlms/gdb-dashboard
Modular visual interface for GDB in Python
Language: Python 1 0
foreverlms/INT8-Flash-Attention-FMHA-Quantization
foreverlms/lightllm
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
foreverlms/llm-numbers
Numbers every LLM developer should know
foreverlms/llvm-project
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. Note: the repository does not accept github pull requests at this moment. Please submit your patches at http://reviews.llvm.org.
0 0
foreverlms/maxas
Assembler for NVIDIA Maxwell architecture
Language: Sass 0 0
foreverlms/MegEngine
MegEngine is a fast, scalable, easy-to-use deep learning framework with support for automatic differentiation.
Language: C++ 0 0
foreverlms/MegPeak
Language: C++ 0 0
foreverlms/MNN
MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba
Language: C++ 0 02
foreverlms/perf-ninja
This is an online course where you can learn and master the skill of low-level performance analysis and tuning.
Language: C++ 0 0
foreverlms/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Language: Python 1 0
foreverlms/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
Language: C++
foreverlms/ZhiLight
A highly optimized inference acceleration engine for Llama and its variants.