Pinned Repositories
Ansor-AF-DS
This repository contains the figures, tables and source code in the ICS'24 paper: "Accelerated Auto-Tuning of GPU Kernels for Tensor Computations".
1point3acres
Automatic daily check-in and quiz answering for the 1point3acres (一亩三分地) forum
ASPLOS_artifact
Auto-Tuning
An auto-tuning script for OpenBLAS
auto_feed_js
One-click reposting script for PT (private tracker) sites
awesome-machine-learning-in-compilers
Must-read research papers and links to tools and datasets related to using machine learning for compilers and systems optimisation
Compiler-experiment
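Compiler principles lab assignments, including a lexical analyzer and syntax analyzers based on recursive descent and predictive parsing. Written in C++.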
libxsmm-and-JIT
Some notes on libxsmm and JIT
mtx-Col-to-Row
Converts an mtx file from column-major to row-major order
OpenBLAS_Kunpeng
This is a fake OpenBLAS. We are going to add some BLAS-like extensions to it.
Lurkrazy's Repositories
Lurkrazy/mtx-Col-to-Row
Converts an mtx file from column-major to row-major order
Lurkrazy/1point3acres
Automatic daily check-in and quiz answering for the 1point3acres (一亩三分地) forum
Lurkrazy/ASPLOS_artifact
Lurkrazy/awesome-machine-learning-in-compilers
Must-read research papers and links to tools and datasets related to using machine learning for compilers and systems optimisation
Lurkrazy/awesome-tensor-compilers
A list of awesome compiler projects and papers for tensor computation and deep learning.
Lurkrazy/Beijing-IPTV
The best Beijing Unicom IPTV channel list. https://bjiptv.eu.org/
Lurkrazy/OpenBLAS-merge
OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.
Lurkrazy/chatgpt-html
ChatGPT HTML online
Lurkrazy/copy-translator
Simple, lightweight, easy-to-use select-to-translate software
Lurkrazy/cuda_sgemm
Lurkrazy/EOP
VEE 22: Efficient Operator Partition for Deep Learning Inference Over Edge Servers
Lurkrazy/frps-onekey
Lurkrazy/go-shadowsocks2
Modern Shadowsocks in Go
Lurkrazy/How_to_optimize_in_GPU
This is a series on GPU optimization topics that explains in detail how to optimize CUDA kernels. It covers several basic kernel optimizations, including elementwise, reduce, sgemv, and sgemm. The performance of these kernels is at or near the theoretical limit.
Lurkrazy/interview-english
English for tech interviews
Lurkrazy/LibShalom
Lurkrazy/lurkrazy.github.io
A Jekyll-based resume template
Lurkrazy/models
Model Zoo for Intel® Architecture: contains Intel optimizations for running deep learning workloads on Intel® Xeon® Scalable processors
Lurkrazy/MYLIB
Lurkrazy/myTLCBench
Lurkrazy/OpBench
Based on TVM. Profiles operator performance with many features.
Lurkrazy/Optimizing-SGEMM-on-NVIDIA-Turing-GPUs
Optimizing SGEMM kernel functions on NVIDIA GPUs to close-to-cuBLAS performance.
Lurkrazy/p1a3_script
Tampermonkey script for 1point3acres (一亩三分地)
Lurkrazy/PyDTNN
PyDTNN - Python Distributed Training of Neural Networks
Lurkrazy/removed-2022-07-12
Lurkrazy/testCuda
Test the CUDA environment
Lurkrazy/TLCBench
Benchmark scripts for TVM
Lurkrazy/Traduzir-paginas-web
Translate your page in real time using Google or Yandex
Lurkrazy/tvm
Open deep learning compiler stack for CPUs, GPUs, and specialized accelerators
Lurkrazy/wowchemy-hugo-themes
🔥 Hugo website builder, Hugo themes & Hugo CMS. No code, build with widgets! Create online courses, academic resumes, or startup websites.