ToExperienceMore's Stars
microsoft/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
google-research/tuning_playbook
A playbook for systematically maximizing the performance of deep learning models.
TencentARC/PhotoMaker
PhotoMaker [CVPR 2024]
MingchaoZhu/DeepLearning
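Python for "Deep Learning": mathematical derivations, principle analysis, and source-level code implementations for the book 《深度学习》 (Deep Learning, the "flower book")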
YaoFANGUK/video-subtitle-extractor
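A GUI tool for extracting hard-coded subtitles (hardsubs) from videos and generating srt files. No third-party API is required; text recognition runs locally. A deep-learning-based video subtitle extraction framework that includes subtitle region detection and subtitle content extraction.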
sindresorhus/create-dmg
Create a good-looking DMG for your macOS app in seconds
NervanaSystems/neon
Intel® Nervana™ reference deep learning framework committed to best performance on all hardware
sindresorhus/open
Open stuff like URLs, files, executables. Cross-platform.
intel/intel-extension-for-transformers
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡
hollance/neural-engine
Everything we actually know about the Apple Neural Engine (ANE)
sindresorhus/KeyboardShortcuts
⌨️ Add user-customizable global keyboard shortcuts (hotkeys) to your macOS app in minutes
BBuf/how-to-optim-algorithm-in-cuda
How to optimize some algorithms in CUDA.
ChunelFeng/CGraph
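[A commonly used C++ DAG framework] A general-purpose, cross-platform parallel computing framework based on flow graphs, with no third-party dependencies, listed in awesome-cpp. Stars, forks, and discussion are welcome.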
DTolm/VkFFT
Vulkan/CUDA/HIP/OpenCL/Level Zero/Metal Fast Fourier Transform library
TheThirdOne/rars
RARS -- RISC-V Assembler and Runtime Simulator
karlrupp/microprocessor-trend-data
Data repository for my blog series on microprocessor trend data.
BBuf/how-to-learn-deep-learning-framework
How to learn PyTorch and OneFlow
zxjzxj9/PyTorchIntroduction
Source code and errata (see Issues) for the book "PyTorch Made Accessible: From Models to Source Code" (深入浅出 PyTorch)
QMCPACK/qmcpack
Main repository for QMCPACK, an open-source production level many-body ab initio Quantum Monte Carlo code for computing the electronic structure of atoms, molecules, and solids with full performance portable GPU support
raas/mbw
Memory Bandwidth Benchmark
deric/clustering-benchmark
mmperf/mmperf
MatMul Performance Benchmarks for a Single CPU Core comparing both hand engineered and codegen kernels.
FanX-Tek/rk3588-TRM-and-Datasheet
Rockchip rk3588(s) Technical Reference Manual and Datasheet
keithyin/read-pytorch-source-code
Reading the PyTorch source code, version 0.2.0
pigirons/conv3x3_m1
A demo of how to write a high-performance convolution that runs on Apple silicon
liuyi12138/ynote2hexo
Sync Youdao Cloud Notes to a Hexo blog
carlushuang/avx_flops
Benchmark CPU FLOPS using AVX instructions
Kenway-20/Cluster_work
mklimasz/TI-NBC
Clustering algorithms (TI-)NBC implementation in Cython
misiek1984/NBC
C++ implementation of Neighborhood-Based Clustering Algorithm