KaneHui

@lepton.ai
Hangzhou, Zhejiang Province, China

Pinned Repositories

AArch64-Explore
Language:Mathematica
applegpu
Apple G13 GPU architecture docs and tools
Language:HTML
ArchProbe
A profiler to disclose and quantify hardware features on GPUs.
Language:C++
asitop
Perf monitoring CLI tool for Apple Silicon
Language:Python
AudioGPT
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
Language:Python
BasicCUDA
Language:Cuda
cformers
SoTA Transformers with C-backend for fast inference on your CPU.
Language:C
DL-Framework-Tutorial
Create a Deep Learning Framework from scratch
gpt4-pdf-chatbot-langchain
GPT4 & LangChain Chatbot for large PDF docs
Language:TypeScript
nnlib
cloned from https://source.codeaurora.org/quic/hexagon_nn/nnlib/
Language:C

KaneHui's Repositories

KaneHui/AArch64-Explore
Language:Mathematica
KaneHui/applegpu
Apple G13 GPU architecture docs and tools
Language:HTML
KaneHui/ArchProbe
A profiler to disclose and quantify hardware features on GPUs.
Language:C++
KaneHui/asitop
Perf monitoring CLI tool for Apple Silicon
Language:Python
KaneHui/AudioGPT
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
Language:Python
KaneHui/BasicCUDA
Language:Cuda
KaneHui/cformers
SoTA Transformers with C-backend for fast inference on your CPU.
Language:C
KaneHui/gpt4-pdf-chatbot-langchain
GPT4 & LangChain Chatbot for large PDF docs
Language:TypeScript
KaneHui/HelloSilicon
An introduction to ARM64 assembly on Apple Silicon Macs
Language:Assembly
KaneHui/incubator-tvm
Open deep learning compiler stack for cpu, gpu and specialized accelerators
Language:Python
KaneHui/insn_bench_aarch64
Instruction latency & throughput profiler for AArch64
Language:C++
KaneHui/llama.cpp
Port of Facebook's LLaMA model in C/C++
Language:C
KaneHui/llvm-project
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. Note: the repository does not accept github pull requests at this moment. Please submit your patches at http://reviews.llvm.org.
KaneHui/LLVM_for_cpu0
A tutorial for learning LLVM that implements a backend generating machine code for cpu0, a simple RISC CPU.
Language:C++
KaneHui/ml-compiler-opt
Infrastructure for Machine Learning Guided Optimization (MLGO) in LLVM.
Language:Python
KaneHui/MMdnn
MMdnn is a set of tools to help users interoperate among different deep learning frameworks, e.g. model conversion and visualization. Convert models between Caffe, Keras, MXNet, TensorFlow, CNTK, PyTorch, ONNX, and CoreML.
Language:Python
KaneHui/GPTQ-for-LLaMa
4 bits quantization of LLaMA using GPTQ
KaneHui/gpu-benches
collection of benchmarks to measure basic GPU capabilities
KaneHui/langchain
⚡ Building applications with LLMs through composability ⚡
KaneHui/llm-viz
3D Visualization of a GPT-style LLM
KaneHui/Medusa
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
Language:Jupyter Notebook
KaneHui/MNN
MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba
KaneHui/MOSS
An open-source tool-augmented conversational language model from Fudan University
KaneHui/netron
Visualizer for neural network, deep learning and machine learning models
Language:JavaScript
KaneHui/NVIDIA_SGEMM_PRACTICE
Step-by-step optimization of CUDA SGEMM
Language:Cuda
KaneHui/SGEMM_CUDA
Fast CUDA matrix multiplication from scratch
Language:Cuda
KaneHui/stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
Language:Python
KaneHui/stf
Control and manage Android devices from your browser.
KaneHui/XiangShan
Open-source high-performance RISC-V processor
Language:Scala
KaneHui/XiangShan-doc
Documentation for XiangShan
Language:TeX