thorneliu
CUDA | AI serving | SD | TF TRT | CTR | brpc | LLM C++ programmer since 2012
vivo
Hangzhou, Zhejiang, China
Pinned Repositories
recommenders-addons
Additional utils and helpers to extend TensorFlow when building recommendation systems, contributed and maintained by SIG Recommenders.
articles
All current objc.io articles
AutoBuild-OpenWrt
Build OpenWrt using GitHub Actions | Thanks to P3TERX for the project source code | Thanks to KFERMercer for the project source code
FasterTransformer
Transformer-related optimization, including BERT and GPT
ggml
Tensor library for machine learning
incubator-brpc
brpc is an industrial-grade RPC framework written in C++, often used in high-performance systems such as search, storage, machine learning, advertising, and recommendation. "brpc" means "better RPC".
Octave_tutorial_CN
Octave tutorial in Chinese; Octave is a MATLAB-like numerical computing tool
stable-fast
An ultra lightweight inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.
yalantinglibs
A collection of C++20 libraries, including async_simple, coro_rpc and struct_pack
thorneliu's Repositories
thorneliu/Octave_tutorial_CN
Octave tutorial in Chinese; Octave is a MATLAB-like numerical computing tool
thorneliu/incubator-brpc
brpc is an industrial-grade RPC framework written in C++, often used in high-performance systems such as search, storage, machine learning, advertising, and recommendation. "brpc" means "better RPC".
thorneliu/recommenders-addons
Additional utils and helpers to extend TensorFlow when building recommendation systems, contributed and maintained by SIG Recommenders.
thorneliu/yalantinglibs
A collection of C++20 libraries, including async_simple, coro_rpc and struct_pack
thorneliu/articles
All current objc.io articles
thorneliu/AutoBuild-OpenWrt
Build OpenWrt using GitHub Actions | Thanks to P3TERX for the project source code | Thanks to KFERMercer for the project source code
thorneliu/AVX-AVX2-Example-Code
Example code for Intel AVX / AVX2 intrinsics.
thorneliu/FasterTransformer
Transformer-related optimization, including BERT and GPT
thorneliu/ggml
Tensor library for machine learning
thorneliu/PaddleServing
A flexible, high-performance carrier for machine learning models (PaddlePaddle serving deployment framework)
thorneliu/stable-fast
An ultra lightweight inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.
thorneliu/AwesomeCpp
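---AWESOME--- C++ study notes and common interview topics, covering C++11 features: smart pointers, the four cast operators, function and bind, move semantics, perfect forwarding, tuple, how polymorphism works, virtual tables, friend functions, operator overloading, function pointers, deep and shallow copy, struct memory alignment, and the usage of keywords such as volatile, union, and static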
thorneliu/chatglm.cpp
C++ implementation of ChatGLM-6B & ChatGLM2-6B
thorneliu/CPP-Templates-2nd--
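Chinese translation of "C++ Templates, 2nd Edition", with the same layout as the original book. Part I (chapters 1 to 11) and chapters 18 to 25 are complete; the remaining content is being updated gradually. This is a personal hobby project; please point out any errors you find.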
thorneliu/CTranslate2
Fast inference engine for Transformer models
thorneliu/CUDA_Freshman
thorneliu/cutlass
CUDA Templates for Linear Algebra Subroutines
thorneliu/flash-attention-minimal
Flash Attention in ~100 lines of CUDA (forward pass only)
thorneliu/HierarchicalKV
HierarchicalKV is a part of NVIDIA Merlin and provides hierarchical key-value storage to meet RecSys requirements. The key capability of Merlin-KV is to store key-value feature embeddings in the high-bandwidth memory (HBM) of GPUs and in host memory. It can also be used as generic key-value storage.
thorneliu/iguana
universal serialization engine
thorneliu/Make
CMake demo
thorneliu/muduo
Event-driven network library for multi-threaded Linux server in C++11
thorneliu/muduo-1
muduo network library based on C++11
thorneliu/PaddleNLP
👑 Easy-to-use and powerful NLP library with 🤗 Awesome model zoo, supporting a wide range of NLP tasks from research to industrial applications, including 🗂 Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis, and 🖼 Diffusion AIGC systems.
thorneliu/PaddleRec
Large-scale recommendation model training tool
thorneliu/ray
Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a toolkit of libraries (Ray AIR) for accelerating ML workloads.
thorneliu/tensornet
thorneliu/TFRecord-Parser
TFRecord parser using C++ and Protocol Buffers
thorneliu/ThreadPool
A simple C++11 Thread Pool implementation
thorneliu/ZLToolKit
A lightweight network framework based on C++11 that uses a thread pool to support high-concurrency network I/O