Pinned Repositories
-LSTM-
Sentiment analysis of text using LSTM
2018AICITY_LasPalmas
3D-detection-with-monocular-RGB-image
3D object detection from a monocular RGB image in autonomous-driving scenes. Uses Faster R-CNN as the base model with a ResNet backbone, implemented in Python.
3DDFA
An improved PyTorch re-implementation of the TPAMI 2017 paper "Face Alignment in Full Pose Range: A 3D Total Solution".
AB3DMOT
(IROS 2020, ECCVW 2020) Official Python Implementation for "3D Multi-Object Tracking: A Baseline and New Evaluation Metrics"
CUDA_Freshman
Interview_Notes-Chinese
2018/2019/campus recruitment/spring recruitment/autumn recruitment/natural language processing (NLP)/deep learning/machine learning/C/C++/Python/interview notes
modern-cpp-tutorial
📚 Modern C++ Tutorial: C++11/14/17/20 On the Fly | https://changkun.de/modern-cpp/
zhouleidcc's Repositories
zhouleidcc/AISystem
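AISystem covers AI systems: the full-stack, low-level technologies of AI, including AI chips, AI compilers, and AI inference and training frameworks.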
zhouleidcc/ao
Native PyTorch library for quantization and sparsity
zhouleidcc/cuda-beginner-course-cpp-version
Companion code for the bilibili video series "Introduction to CUDA 12.1 Parallel Programming (C++ Edition)"
zhouleidcc/cuda_hgemm
Several optimization methods of half-precision general matrix multiplication (HGEMM) using tensor core with WMMA API and MMA PTX instruction.
zhouleidcc/CudaSteps
A CUDA learning path based on the book 《CUDA编程:基础与实践》 (CUDA Programming: Basics and Practice) by 樊哲勇 (Zheyong Fan).
zhouleidcc/cugraph
cuGraph - RAPIDS Graph Analytics Library
zhouleidcc/cute-gemm
zhouleidcc/Cute-Gemm-Optimization
zhouleidcc/cutlass_performance_profiling
Exploration of GEMM Performance Improvement with CUTLASS
zhouleidcc/CutlassProgramming_learning
zhouleidcc/google-research
Google Research
zhouleidcc/gpu-toolkit
🦚 🧰 Collection of basic GPU algorithms implemented in CUDA C++.
zhouleidcc/hackhackAwesome-Hacking
A collection of various awesome lists for hackers, pentesters and security researchers
zhouleidcc/hackhackawesome-web-hacking
A list of web application security resources
zhouleidcc/llm.c
LLM training in simple, raw C/CUDA
zhouleidcc/MirrorSite
A collection of mirror sites
zhouleidcc/mlir-hello
MLIR Sample dialect
zhouleidcc/mlir-tutorial
zhouleidcc/mlir-tutorial_cn
Hands-On Practical MLIR Tutorial
zhouleidcc/only_train_once
OTOv1-v3, NeurIPS, ICLR, TMLR, DNN Training, Compression, Structured Pruning, Erasing Operators, CNN, Diffusion, LLM
zhouleidcc/onnx-modifier
A tool to modify ONNX models in a visualization fashion, based on Netron and Flask.
zhouleidcc/PTX-ISA-chinese
Chinese translation of the CUDA PTX ISA documentation
zhouleidcc/pymlir
Python interface for MLIR - the Multi-Level Intermediate Representation
zhouleidcc/resource-stream
CUDA related news and material links
zhouleidcc/riscv-v-spec
Working draft of the proposed RISC-V V vector extension
zhouleidcc/rocMLIR
zhouleidcc/SHARK-Turbine
Unified compiler/runtime for interfacing with PyTorch Dynamo.
zhouleidcc/TensorRT
NVIDIA® TensorRT™, an SDK for high-performance deep learning inference, includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for inference applications.
zhouleidcc/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
zhouleidcc/torch-xla-SPMD
PyTorch/XLA SPMD test code for Google TPU