Pinned Repositories
Add_test
caffe
Caffe: a fast open framework for deep learning.
Camp
Training camp for the PaddlePaddle (飞桨) Escort Program
CHIP-KNN
[FPT'20] CHIP-KNN: Configurable and High-Performance K-Nearest Neighbors Accelerator on Cloud FPGAs
cifar10-HLS
HLS implementation of CNN forward propagation for the CIFAR-10 dataset
docs
Documentation for PaddlePaddle
KuiperInfer
Implement a high-performance deep learning inference library from scratch, step by step
Paddle
PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the PaddlePaddle (飞桨) core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning & machine learning)
particle_transformer
Official implementation of "Particle Transformer for Jet Tagging".
ParticleNet
Implementation of the jet classification network in ParticleNet: Jet Tagging via Particle Clouds
zyt1024's Repositories
zyt1024/Add_test
zyt1024/Camp
Training camp for the PaddlePaddle (飞桨) Escort Program
zyt1024/CHIP-KNN
[FPT'20] CHIP-KNN: Configurable and High-Performance K-Nearest Neighbors Accelerator on Cloud FPGAs
zyt1024/cifar10-HLS
HLS implementation of CNN forward propagation for the CIFAR-10 dataset
zyt1024/docs
Documentation for PaddlePaddle
zyt1024/KuiperInfer
Implement a high-performance deep learning inference library from scratch, step by step
zyt1024/Paddle
PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the PaddlePaddle (飞桨) core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning & machine learning)
zyt1024/particle_transformer
Official implementation of "Particle Transformer for Jet Tagging".
zyt1024/WbeeServer
zyt1024/XRT
Xilinx Runtime (XRT) for FPGAs
zyt1024/CNN_HLS_FPGA
Implementing a CNN with HLS
zyt1024/examples
A set of examples around pytorch in Vision, Text, Reinforcement Learning, etc.
zyt1024/fastllm
Pure C++ cross-platform LLM acceleration library with Python bindings; ChatGLM-6B-class models reach 10,000+ tokens/s on a single GPU; supports GLM, LLaMA, and MOSS base models; runs smoothly on mobile devices
zyt1024/hls4ml
Machine learning on FPGAs using HLS
zyt1024/How_to_optimize_in_GPU
This is a series of GPU optimization topics. Here we introduce in detail how to optimize CUDA kernels, covering several basic kernel optimizations, including elementwise, reduce, sgemv, sgemm, etc. The performance of these kernels is at or near the theoretical limit.
zyt1024/myxv6
zyt1024/ncnn
ncnn is a high-performance neural network inference framework optimized for the mobile platform
zyt1024/PaddleTest
PaddlePaddle TestSuite
zyt1024/portal-frontend
The dataset metadata sharing platform frontend
zyt1024/server
MariaDB server is a community developed fork of MySQL server. Started by core members of the original MySQL team, MariaDB actively works with outside developers to deliver the most featureful, stable, and sanely licensed open SQL server in the industry.
zyt1024/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
zyt1024/TestGit
zyt1024/tgi_zyt
Large Language Model Text Generation Inference
zyt1024/Vitis-AI
Vitis AI is Xilinx’s development stack for AI inference on Xilinx hardware platforms, including both edge devices and Alveo cards.
zyt1024/Vitis-HLS-Introductory-Examples
zyt1024/Vitis-Tutorials
Vitis In-Depth Tutorials
zyt1024/Vitis_Accel_Examples
Vitis_Accel_Examples
zyt1024/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
zyt1024/weaver-core
Streamlined neural network training.
zyt1024/zyt1024
readme