xie-1399

Computer Arch & Domain Accelerator & Machine Learning :)

BUAA, China

xie-1399's Stars

hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
Language: Python 20.3k 176 3531.9k
state-spaces/mamba
Mamba SSM architecture
Language: Python 11.5k 98 402943
jbush001/NyuziProcessor
GPGPU microprocessor architecture
Language: C 2k 141 168348
TheBloodthirster/BUAA_Course_Sharing
Beihang University (BUAA) course assignment and materials sharing project
Language: Mathematica 1.4k 22 6281
liangkangnan/tinyriscv
A very simple and easy to understand RISC-V core.
Language: C 1k 17 8181
horseee/Awesome-Efficient-LLM
A curated list for Efficient Large Language Models
Language: Python 946 37 172
lukemelas/PyTorch-Pretrained-ViT
Vision Transformer (ViT) in PyTorch
Language: Python 761 10 30124
BRTResearch/AIChip_Paper_List
541 38 3110
soDLA-publishment/soDLA
Chisel implementation of the NVIDIA Deep Learning Accelerator (NVDLA), with self-driving accelerated
Language: Scala 214 21 1947
mflowgen/mflowgen
mflowgen -- A Modular ASIC/FPGA Flow Generator
Language: Python 208 19 3351
scalesim-project/scale-sim-v2
Repository to host and maintain scale-sim-v2 code
Language: Python 191 4 6290
CMU-SAFARI/ramulator2
Ramulator 2.0 is a modern, modular, extensible, and fast cycle-accurate DRAM simulator. It provides support for agile implementation and evaluation of new memory system designs (e.g., new DRAM standards, emerging RowHammer mitigation techniques). Described in our paper https://people.inf.ethz.ch/omutlu/pub/Ramulator2_arxiv23.pdf
Language: C++ 183 13 4142
lirui-shanghaitech/CNN-Accelerator-VLSI
Convolutional accelerator kernel, targeting ASIC & FPGA
Language: Verilog 148 3 222
alimpk/transfomers-silicon-research
Research and materials on hardware implementation of Transformer models
Language: Jupyter Notebook 146 6 121
mflowgen/freepdk-45nm
ASIC Design Kit for FreePDK45 + Nangate for use with mflowgen
Language: Verilog 126 8 334
19801201/SpinalHDL_CNN_Accelerator
CNN accelerator implemented with Spinal HDL
Language: Scala 125 6 333
SamsungLabs/Butterfly_Acc
The code and artifacts associated with our MICRO'22 paper titled "Adaptable Butterfly Accelerator for Attention-based NNs via Hardware and Algorithm Co-design"
Language: Verilog 97 2 515
leo47007/TPU-Tensor-Processing-Unit
IC implementation of TPU
Language: Verilog 84 5 026
Buck008/Transformer-Accelerator-Based-on-FPGA
Runs on the PYNQ-Z1. The repository contains the relevant Verilog code, Vivado configuration, and C code for SDK testing. The size of the systolic array is configurable; it is currently 16x16.
Language: Verilog 80 1 25
mit-han-lab/spatten-llm
[HPCA'21] SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning
Language: Scala 55 8 13
Chainsaw-Team/Chainsaw
A hardware design library based on SpinalHDL, aimed especially at stream-processing operators on Xilinx FPGAs for arithmetic, DSP, communication, and crypto applications
Language: Scala 50 3 219
hanm2019/bucket-based_farthest-point-sampling_CPU
CPU implementation of bucket-based farthest point sampling; achieves a 7-81x speedup over the conventional implementation
Language: C++ 7 1 11
hanm2019/bucket-based_farthest-point-sampling_GPU
GPU implementation of bucket-based farthest point sampling; achieves a 3-4x speedup over the conventional implementation
Language: Cuda 7 1 00
xie-1399/Brief-Chip
Brief Chip is a simple SoC project written in SpinalHDL; it includes a 3-stage RISC-V CPU and a CNN accelerator with RS dataflow as a peripheral
Language: Scala 3 1 00
xie-1399/Nvdla_Spinal
Rebuilding the NVDLA architecture in SpinalHDL
Language: Verilog 3 1 00
xie-1399/StylePatch
Source code of StylePatch (an adversarial patch attack method using local style fusion)
Language: Python 20
xie-1399/Systolic-Array
A collection of systolic array accelerator architecture implementations
Language: Scala 2 1 30
xie-1399/Awesome-Paper
A record of paper reading on DSA, GPU, LLM, and AI systems
1 1 00
xie-1399/ICTools
Some really useful tools for ASIC/FPGA design
Language: Verilog 1 1 00
xie-1399/Multimodal-VLP
An implementation of multimodal VLP Transformer algorithms
1 1 0
