hongsunjang's Stars
karpathy/llama2.c
Inference Llama 2 in one file of pure C
pybind/pybind11
Seamless operability between C++11 and Python
NVIDIA/open-gpu-kernel-modules
NVIDIA Linux open GPU kernel module source
naklecha/llama3-from-scratch
llama3 implementation one matrix multiplication at a time
google-research/arxiv-latex-cleaner
arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv
ufrisk/pcileech
Direct Memory Access (DMA) Attack Software
intel/pcm
Intel® Performance Counter Monitor (Intel® PCM)
Xilinx/PYNQ
Python Productivity for ZYNQ
hao-ai-lab/LookaheadDecoding
[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
ufrisk/pcileech-fpga
FPGA modules used together with the PCILeech Direct Memory Access (DMA) Attack Software
NVIDIA/gdrcopy
A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology
Xilinx/Vitis_Libraries
Vitis Libraries
KastnerRG/pp4fpgas
Parallel Programming for FPGAs -- An open-source high-level synthesis book
hpcaitech/FastFold
Optimizing AlphaFold Training and Inference on GPU Clusters
SHI-Labs/NATTEN
Neighborhood Attention Extension. Bringing attention to a neighborhood near you!
SqueezeAILab/KVQuant
[NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
NVIDIA/gds-nvidia-fs
NVIDIA GPUDirect Storage Driver
rapidstream-org/rapidstream-tapa
RapidStream TAPA compiles task-parallel HLS programs into high-frequency FPGA accelerators.
itsnamgyu/block-transformer
Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024)
ZaidQureshi/bam
ogiroux/freestanding
casys-kaist/NeuPIMs
NeuPIMs Simulator
SNU-ARC/Ginex
Ginex: SSD-enabled Billion-scale Graph Neural Network Training on a Single Machine via Provably Optimal In-memory Caching
template-hls/template-hls-float
jaewonalive/PeerAiD
UCLA-VAST/Serpens
Serpens is an HBM FPGA accelerator for SpMV
svn2github/pagecache-management
This is a clone of an SVN repository at http://pagecache-mangagement.googlecode.com/svn/trunk. It had been cloned by http://svn2github.com/ , but the service was since closed. Please read a closing note on my blog post: http://piotr.gabryjeluk.pl/blog:closing-svn2github . If you want to continue synchronizing this repo, look at https://github.com/gabrys/svn2github
gem5-hpca-2024/gem5
ogiroux/libcxx
Mirror of official libcxx git repository located at http://llvm.org/git/libcxx. Updated every five minutes.
tjruwase/transformers
🤗 Transformers: State-of-the-art Natural Language Processing for Pytorch, TensorFlow, and JAX.