zhuochenKIDD

Beijing

Pinned Repositories

AITemplate
AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. It is specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
Language: Python
annotated-transformer
An annotated implementation of the Transformer paper.
Language: Jupyter Notebook
Auto-GPT
An experimental open-source attempt to make GPT-4 fully autonomous.
Language: Python
ColossalAI
Colossal-AI: A Unified Deep Learning System for Big Model Era
Language: Python
CppCoreGuidelines
The C++ Core Guidelines are a set of tried-and-true guidelines, rules, and best practices about coding in C++
Language: Python
cuda-samples
Samples for CUDA developers that demonstrate features in the CUDA Toolkit
Language: C
cutlass
CUDA Templates for Linear Algebra Subroutines
Language: C++
DeepLearningExamples
Deep Learning Examples
Language: Python
DeepLearningNotes
Machine learning and quantitative analysis studies, in progress
Language: Python

zhuochenKIDD's Repositories

zhuochenKIDD/annotated-transformer
An annotated implementation of the Transformer paper.
Language: Jupyter Notebook
zhuochenKIDD/cuda-samples
Samples for CUDA developers that demonstrate features in the CUDA Toolkit
Language: C
zhuochenKIDD/cutlass
CUDA Templates for Linear Algebra Subroutines
Language: C++
zhuochenKIDD/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Language: Python
zhuochenKIDD/gemm-study
Language: C++
zhuochenKIDD/llvm-project
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. Note: the repository does not accept github pull requests at this moment. Please submit your patches at http://reviews.llvm.org.
zhuochenKIDD/gpt-quant
zhuochenKIDD/gpt4all
gpt4all: open-source LLM chatbots that you can run anywhere
Language: C++
zhuochenKIDD/iree
A retargetable MLIR-based machine learning compiler and runtime toolkit.
Language: C++
zhuochenKIDD/jax
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
Language: Python
zhuochenKIDD/llama.cpp
Port of Facebook's LLaMA model in C/C++
Language: C
zhuochenKIDD/micrograd
A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API
Language: Jupyter Notebook
zhuochenKIDD/milvus
A cloud-native vector database, storage for next generation AI applications
Language: Go
zhuochenKIDD/onnx
Open standard for machine learning interoperability
Language: Python
zhuochenKIDD/onnx-simplifier
Simplify your ONNX model
Language: C++
zhuochenKIDD/openvino
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
Language: C++
zhuochenKIDD/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Language: Python
zhuochenKIDD/pytorch-utils
A newbie to PyTorch
zhuochenKIDD/tensorflow
An Open Source Machine Learning Framework for Everyone
Language: C++
zhuochenKIDD/tensorflow-onnx
Convert TensorFlow, Keras, TensorFlow.js, and TFLite models to ONNX
Language: Jupyter Notebook
zhuochenKIDD/TensorRT
NVIDIA® TensorRT™, an SDK for high-performance deep learning inference, includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for inference applications.
Language: C++
zhuochenKIDD/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
Language: C++
zhuochenKIDD/text-generation-inference
Large Language Model Text Generation Inference
Language: Python
zhuochenKIDD/tf-utils
Arsenal for TensorFlow models manipulation
Language: Python
zhuochenKIDD/tinygrad
You like pytorch? You like micrograd? You love tinygrad! ❤️
Language: Python
zhuochenKIDD/triton
Development repository for the Triton language and compiler
Language: C++
zhuochenKIDD/tvm
Open deep learning compiler stack for CPUs, GPUs, and specialized accelerators
Language: Python
zhuochenKIDD/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python
zhuochenKIDD/vnpy
A Python-based open-source framework for developing quantitative trading platforms
Language: Python
zhuochenKIDD/xla
A machine learning compiler for GPUs, CPUs, and ML accelerators
Language: C++