xie-1399's Stars
hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
state-spaces/mamba
Mamba SSM architecture
jbush001/NyuziProcessor
GPGPU microprocessor architecture
TheBloodthirster/BUAA_Course_Sharing
Beihang University (BUAA) course assignment and materials sharing project
liangkangnan/tinyriscv
A very simple and easy to understand RISC-V core.
horseee/Awesome-Efficient-LLM
A curated list for Efficient Large Language Models
lukemelas/PyTorch-Pretrained-ViT
Vision Transformer (ViT) in PyTorch
BRTResearch/AIChip_Paper_List
soDLA-publishment/soDLA
Chisel implementation of the NVIDIA Deep Learning Accelerator (NVDLA), with self-driving accelerated
mflowgen/mflowgen
mflowgen -- A Modular ASIC/FPGA Flow Generator
scalesim-project/scale-sim-v2
Repository to host and maintain scale-sim-v2 code
CMU-SAFARI/ramulator2
Ramulator 2.0 is a modern, modular, extensible, and fast cycle-accurate DRAM simulator. It provides support for agile implementation and evaluation of new memory system designs (e.g., new DRAM standards, emerging RowHammer mitigation techniques). Described in our paper https://people.inf.ethz.ch/omutlu/pub/Ramulator2_arxiv23.pdf
lirui-shanghaitech/CNN-Accelerator-VLSI
Convolutional accelerator kernel targeting ASIC and FPGA
alimpk/transfomers-silicon-research
Research and materials on hardware implementation of the Transformer model
mflowgen/freepdk-45nm
ASIC Design Kit for FreePDK45 + Nangate for use with mflowgen
19801201/SpinalHDL_CNN_Accelerator
CNN accelerator implemented with Spinal HDL
SamsungLabs/Butterfly_Acc
The codes and artifacts associated with our MICRO'22 paper titled: "Adaptable Butterfly Accelerator for Attention-based NNs via Hardware and Algorithm Co-design"
leo47007/TPU-Tensor-Processing-Unit
IC implementation of TPU
Buck008/Transformer-Accelerator-Based-on-FPGA
Runs on the PYNQ-Z1. The repository contains the relevant Verilog code, Vivado configuration, and C code for SDK testing. The systolic array size is configurable; it is currently 16x16.
mit-han-lab/spatten-llm
[HPCA'21] SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning
Chainsaw-Team/Chainsaw
A hardware design library based on SpinalHDL, especially for stream processing operators on Xilinx FPGAs for arithmetic, DSP, communication, and crypto applications
hanm2019/bucket-based_farthest-point-sampling_CPU
CPU implementation of bucket-based farthest point sampling, achieving a 7-81x speedup over the conventional implementation
hanm2019/bucket-based_farthest-point-sampling_GPU
GPU implementation of bucket-based farthest point sampling, achieving a 3-4x speedup over the conventional implementation
xie-1399/Brief-Chip
The Brief Chip is a simple SoC project written in SpinalHDL. It includes a 3-stage RISC-V CPU and, as a peripheral, a CNN accelerator with RS dataflow.
xie-1399/Nvdla_Spinal
Rebuilding the NVDLA architecture in SpinalHDL
xie-1399/StylePatch
Source code of StylePatch (an adversarial patch attack method using local style fusion)
xie-1399/Systolic-Array
A collection of systolic array accelerator architecture implementations
xie-1399/Awesome-Paper
Paper reading notes on DSA, GPU, LLM, and AI systems
xie-1399/ICTools
Some really useful tools for ASIC/FPGA design
xie-1399/Multimodal-VLP
Implementation of multimodal VLP Transformer algorithms