Pinned Repositories
AITemplate
AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. It is specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
annotated-transformer
An annotated implementation of the Transformer paper.
Auto-GPT
An experimental open-source attempt to make GPT-4 fully autonomous.
ColossalAI
Colossal-AI: A Unified Deep Learning System for the Big Model Era
CppCoreGuidelines
The C++ Core Guidelines are a set of tried-and-true guidelines, rules, and best practices about coding in C++
cuda-samples
Samples for CUDA developers that demonstrate features in the CUDA Toolkit
cutlass
CUDA Templates for Linear Algebra Subroutines
DeepLearningExamples
Deep Learning Examples
DeepLearningNotes
Machine learning and quantitative analysis studies, in progress
zhuochenKIDD's Repositories
zhuochenKIDD/annotated-transformer
An annotated implementation of the Transformer paper.
zhuochenKIDD/cuda-samples
Samples for CUDA developers that demonstrate features in the CUDA Toolkit
zhuochenKIDD/cutlass
CUDA Templates for Linear Algebra Subroutines
zhuochenKIDD/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
zhuochenKIDD/gemm-study
zhuochenKIDD/llvm-project
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. Note: the repository does not accept GitHub pull requests at this moment. Please submit your patches at http://reviews.llvm.org.
zhuochenKIDD/gpt-quant
zhuochenKIDD/gpt4all
gpt4all: open-source LLM chatbots that you can run anywhere
zhuochenKIDD/iree
A retargetable MLIR-based machine learning compiler and runtime toolkit.
zhuochenKIDD/jax
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
zhuochenKIDD/llama.cpp
Port of Facebook's LLaMA model in C/C++
zhuochenKIDD/micrograd
A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API
zhuochenKIDD/milvus
A cloud-native vector database, storage for next generation AI applications
zhuochenKIDD/onnx
Open standard for machine learning interoperability
zhuochenKIDD/onnx-simplifier
Simplify your ONNX model
zhuochenKIDD/openvino
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
zhuochenKIDD/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
zhuochenKIDD/pytorch-utils
A newbie to PyTorch
zhuochenKIDD/tensorflow
An Open Source Machine Learning Framework for Everyone
zhuochenKIDD/tensorflow-onnx
Convert TensorFlow, Keras, TensorFlow.js and TFLite models to ONNX
zhuochenKIDD/TensorRT
NVIDIA® TensorRT™, an SDK for high-performance deep learning inference, includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for inference applications.
zhuochenKIDD/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
zhuochenKIDD/text-generation-inference
Large Language Model Text Generation Inference
zhuochenKIDD/tf-utils
Arsenal for TensorFlow models manipulation
zhuochenKIDD/tinygrad
You like pytorch? You like micrograd? You love tinygrad! ❤️
zhuochenKIDD/triton
Development repository for the Triton language and compiler
zhuochenKIDD/tvm
Open deep learning compiler stack for CPUs, GPUs and specialized accelerators
zhuochenKIDD/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
zhuochenKIDD/vnpy
A Python-based open-source framework for developing quantitative trading platforms
zhuochenKIDD/xla
A machine learning compiler for GPUs, CPUs, and ML accelerators