turbo0628

HPC's Productive Computing!

taichi.graphics · Shenzhen

Pinned Repositories

RWKV-CUDA
The CUDA version of the RWKV language model ( https://github.com/BlinkDL/RWKV-LM )
Language: Python · 0 1
taichi
Productive, portable, and performant GPU programming in Python.
Language: C++ · 24.9k 390 2.6k 2.3k
FeatherCNN
FeatherCNN is a high performance inference engine for convolutional neural networks.
Language: C++ · 1.2k 101 44285
3x3_SVD_CUDA
Fast CUDA 3x3 SVD
Language: C++ · 0 0
cpufp
A CPU tool for benchmarking the peak of floating points
Language: C · 1 1 0 1
light-model-transformer
Language: C++ · 2 3 0 0
LSBDS
Large Scale Biology Database Search on Xeon Phi platform
1 2 0 0
SWhybrid
Language: C++ · 6 3 0 2
Taichi-MPI
The Taichi MPI demos with MPI4Py
Language: Python · 11 2 0 2
test_feather_ncnn
The utility project to test computing results for FeatherCNN and ncnn
Language: C++ · 1 3 0 1

turbo0628's Repositories

turbo0628/Taichi-MPI
The Taichi MPI demos with MPI4Py
Language: Python · 11 2 0 2
turbo0628/cpufp
A CPU tool for benchmarking the peak of floating points
Language: C · 1 1 0 1
turbo0628/blog_code
Language: Python · 1 0
turbo0628/cluster_a3m
Language: Python · 2 0
turbo0628/diff-gaussian-rasterization
turbo0628/docathon
2 0 2
turbo0628/esm
Evolutionary Scale Modeling (esm): Pretrained language models for proteins
Language: Python · 1 0
turbo0628/graphi-t
Handy tools & graphics API abstraction for blazing fast prototyping
turbo0628/jax-md
Differentiable, Hardware Accelerated, Molecular Dynamics
Language: Jupyter Notebook · 1 0
turbo0628/JD331
turbo0628/MAC-taichi
A MAC (Marker-And-Cell) solver written in Taichi
turbo0628/MetalBugReprod
Minimal reproduction of an Apple metal compilation bug.
Language: C++
turbo0628/mini-nbody
A simple gravitational N-body simulation in less than 100 lines of C code, with CUDA optimizations.
Language: C
turbo0628/mpm_ptx_kernels
compare the mpm kernel performance
turbo0628/PFNN_TVM
Efficient PFNN implementations enabled by TVM
2 0
turbo0628/ppl.nn
A primitive library for neural network
Language: C++ · 1 0
turbo0628/prefix_sum_android
Language: C++ · 1
turbo0628/quaternion
A brief introduction to quaternions and their applications in 3D geometry.
turbo0628/rhino3dm
Libraries based on OpenNURBS with a RhinoCommon style
1
turbo0628/rosetta-json-test
The json test suite for pyrosetta compilation
Language: C++ · 3 0 1
turbo0628/RWKV-CUDA
The CUDA version of the RWKV language model ( https://github.com/BlinkDL/RWKV-LM )
turbo0628/stable-fast
An ultra lightweight inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.
turbo0628/taichi
Productive & portable high-performance programming in Python.
Language: C++ · 1 0
turbo0628/taichi-aot-demo
A demo illustrating how to use Taichi as an AOT shader compiler
turbo0628/taichi-benchmark
Language: Python
turbo0628/Taichi-UnityExample
turbo0628/taichi_benchmark
turbo0628/TaichiCloud
Language: Python · 1 0
turbo0628/uVkCompute
A micro Vulkan compute pipeline and a collection of benchmarking compute shaders
turbo0628/vram_test
Language: C++