chenxfeng's Stars
CyC2018/CS-Notes
:books: Essential fundamentals for technical interviews: Leetcode, computer operating systems, computer networks, and system design
pytorch/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
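A minimal sketch of the workflow this blurb describes (tensors, define-by-run autograd, optional GPU acceleration); the shapes and device check below are illustrative assumptions, not anything taken from this listing:

```python
import torch

# Fall back to CPU when no CUDA device is available (assumption for illustration).
device = "cuda" if torch.cuda.is_available() else "cpu"

x = torch.randn(4, 3, device=device, requires_grad=True)  # random input tensor
w = torch.randn(3, 2, device=device, requires_grad=True)  # weight tensor

y = (x @ w).relu().sum()  # the graph is built dynamically as operations execute
y.backward()              # autograd computes gradients for x and w

print(w.grad.shape)  # torch.Size([3, 2])
```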
521xueweihan/GitHub520
:kissing_heart: Makes you "love" GitHub by fixing broken images and slow page loads. (No installation required)
PaddlePaddle/Paddle
PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the core framework of PaddlePaddle/飞桨: high-performance single-machine and distributed training and cross-platform deployment for deep learning & machine learning)
apache/mxnet
Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with a Dynamic, Mutation-aware Dataflow Dependency Scheduler; for Python, R, Julia, Scala, Go, JavaScript and more
NVIDIA/open-gpu-kernel-modules
NVIDIA Linux open GPU kernel module source
zhisheng17/flink-learning
Flink learning blog. http://www.54tianzhisheng.cn/ Covers Flink basics, concepts, internals, hands-on practice, performance tuning, and source-code analysis. Includes learning examples for Flink Connectors, Metrics, Libraries, the DataStream API, and the Table API & SQL, as well as large production case studies of Flink in the field (PV/UV counting, log storage, real-time deduplication of tens of billions of records, monitoring and alerting). Support for the author's column "Big Data Real-Time Compute Engine Flink in Action and Performance Optimization" is welcome.
apache/tvm
Open deep learning compiler stack for CPUs, GPUs, and specialized accelerators
NVIDIA/TensorRT
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
openvinotoolkit/openvino
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
intel-analytics/ipex-llm
Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, GraphRAG, DeepSpeed, Axolotl, etc
mindspore-ai/mindspore
MindSpore is a new open source deep learning training/inference framework that could be used for mobile, edge and cloud scenarios.
NervanaSystems/neon
Intel® Nervana™ reference deep learning framework committed to best performance on all hardware
oneapi-src/oneDNN
oneAPI Deep Neural Network Library (oneDNN)
alibaba/Alink
Alink is the Machine Learning algorithm platform based on Flink, developed by the PAI team of Alibaba computing platform.
ARM-software/ComputeLibrary
The Compute Library is a set of computer vision and machine learning functions optimised for both Arm CPUs and GPUs using SIMD technologies.
soumith/convnet-benchmarks
Easy benchmarking of all publicly accessible implementations of convnets
ispc/ispc
Intel® Implicit SPMD Program Compiler
msys2/msys2
A software distro and building platform for Windows
intelxed/xed
The X86 Encoder Decoder (XED) is a software library for encoding and decoding X86 (IA32 and Intel64) instructions
princeton-nlp/MeZO
[NeurIPS 2023] MeZO: Fine-Tuning Language Models with Just Forward Passes. https://arxiv.org/abs/2305.17333
NVIDIA/gdrcopy
A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology
libxsmm/libxsmm
Library for specialized dense and sparse matrix operations, and deep learning primitives.
NVIDIA/caffe
Caffe: a fast open framework for deep learning.
jmeubank/tdm-gcc
TDM-GCC is a cleverly disguised GCC compiler for Windows!
keystone-enclave/keystone
Keystone Enclave (QEMU + HiFive Unleashed)
numactl/numactl
NUMA support for Linux
nascab/nascab-web
daadaada/turingas
Assembler for NVIDIA Volta and Turing GPUs
xingyul/sparse-winograd-cnn
Efficient Sparse-Winograd Convolutional Neural Networks (ICLR 2018)