Pinned Repositories
-LSTM-
Sentiment analysis of text using LSTM
2018AICITY_LasPalmas
3D-detection-with-monocular-RGB-image
3D object detection in autonomous-driving scenes from a monocular RGB image. Uses Faster R-CNN as the base model and ResNet as the backbone, implemented in Python.
3DDFA
An improved PyTorch re-implementation of the TPAMI 2017 paper: Face Alignment in Full Pose Range: A 3D Total Solution.
AB3DMOT
(IROS 2020, ECCVW 2020) Official Python Implementation for "3D Multi-Object Tracking: A Baseline and New Evaluation Metrics"
CUDA_Freshman
Interview_Notes-Chinese
2018/2019/campus recruitment/spring recruitment/autumn recruitment/Natural Language Processing (NLP)/Deep Learning/Machine Learning/C/C++/Python/interview notes
modern-cpp-tutorial
📚 Modern C++ Tutorial: C++11/14/17/20 On the Fly | https://changkun.de/modern-cpp/
zhouleidcc's Repositories
zhouleidcc/AISystem
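AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks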
zhouleidcc/ao
Native PyTorch library for quantization and sparsity
zhouleidcc/CenterPoint_train
zhouleidcc/cuda-beginner-course-cpp-version
Companion code for the bilibili video "Introduction to CUDA 12.1 Parallel Programming (C++ Version)"
zhouleidcc/cugraph
cuGraph - RAPIDS Graph Analytics Library
zhouleidcc/cute-gemm
zhouleidcc/Cute-Gemm-Optimization
zhouleidcc/cutlass-b2bgemm
An extension to the CUTLASS half-precision b2b GEMM example
zhouleidcc/Cutlass_EX
Study of CUTLASS
zhouleidcc/cutlass_performance_profiling
Exploration of GEMM Performance Improvement with CUTLASS
zhouleidcc/CutlassProgramming_learning
zhouleidcc/google-research
Google Research
zhouleidcc/gpu-toolkit
🦚 🧰 Collection of basic GPU algorithms implemented in CUDA C++.
zhouleidcc/ImmortalTracker-for-CTRL
zhouleidcc/llm.c
LLM training in simple, raw C/CUDA
zhouleidcc/MirrorSite
Collection of mirror sites
zhouleidcc/mlir-hello
MLIR Sample dialect
zhouleidcc/mlir-tutorial
zhouleidcc/mlir-tutorial_cn
Hands-On Practical MLIR Tutorial
zhouleidcc/muda
μ-Cuda, yet another painless cuda programming paradigm. With features: intellisense-friendly, structured launch, automatic cuda graph generation and updating.
zhouleidcc/onnx-modifier
A tool to modify ONNX models in a visualization fashion, based on Netron and Flask.
zhouleidcc/pymlir
Python interface for MLIR - the Multi-Level Intermediate Representation
zhouleidcc/resource-stream
CUDA related news and material links
zhouleidcc/rocMLIR
zhouleidcc/SHARK-Turbine
Unified compiler/runtime for interfacing with PyTorch Dynamo.
zhouleidcc/SST
Codes for “Fully Sparse 3D Object Detection” & “Embracing Single Stride 3D Object Detector with Sparse Transformer”
zhouleidcc/TensorRT
NVIDIA® TensorRT™, an SDK for high-performance deep learning inference, includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for inference applications.
zhouleidcc/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
zhouleidcc/torch-xla-SPMD
PyTorch/XLA SPMD test code on Google TPU
zhouleidcc/torchsparse
[MLSys'22] TorchSparse: Efficient Point Cloud Inference Engine