SAFARI Research Group at ETH Zurich and Carnegie Mellon University
Site for source code and tools distribution from SAFARI Research Group at ETH Zurich and Carnegie Mellon University.
ETH Zurich and Carnegie Mellon University
Pinned Repositories
DAMOV
DAMOV is a benchmark suite and a methodical framework targeting the study of data movement bottlenecks in modern applications. It is intended to study new architectures, such as near-data processing. Described by Oliveira et al. (preliminary version at https://arxiv.org/pdf/2105.03725.pdf)
MQSim
MQSim is a fast and accurate simulator modeling the performance of modern multi-queue (MQ) SSDs as well as traditional SATA based SSDs. MQSim faithfully models new high-bandwidth protocol implementations, steady-state SSD conditions, and the full end-to-end latency of requests in modern SSDs. It is described in detail in the FAST 2018 paper by Arash Tavakkol et al., "MQSim: A Framework for Enabling Realistic Studies of Modern Multi-Queue SSD Devices" (https://people.inf.ethz.ch/omutlu/pub/MQSim-SSD-simulation-framework_fast18.pdf)
prim-benchmarks
PrIM (Processing-In-Memory benchmarks) is the first benchmark suite for a real-world processing-in-memory (PIM) architecture. PrIM is developed to evaluate, analyze, and characterize the first publicly-available real-world PIM architecture, the UPMEM PIM architecture. Described by Gómez-Luna et al. (https://arxiv.org/abs/2105.03814).
Pythia
A customizable hardware prefetching framework using online reinforcement learning as described in the MICRO 2021 paper by Bera et al. (https://arxiv.org/pdf/2109.12021.pdf).
ramulator
A Fast and Extensible DRAM Simulator, with built-in support for modeling many different DRAM technologies including DDRx, LPDDRx, GDDRx, WIOx, HBMx, and various academic proposals. Described in the IEEE CAL 2015 paper by Kim et al. at http://users.ece.cmu.edu/~omutlu/pub/ramulator_dram_simulator-ieee-cal15.pdf
ramulator-pim
A fast and flexible simulation infrastructure for exploring general-purpose processing-in-memory (PIM) architectures. Ramulator-PIM combines a widely-used simulator for out-of-order and in-order processors (ZSim) with Ramulator, a DRAM simulator with memory models for DDRx, LPDDRx, GDDRx, WIOx, HBMx, and HMCx. Ramulator is described in the IEEE CAL 2015 paper by Kim et al. at https://people.inf.ethz.ch/omutlu/pub/ramulator_dram_simulator-ieee-cal15.pdf Ramulator-PIM is used in the DAC 2019 paper by Singh et al. at https://people.inf.ethz.ch/omutlu/pub/NAPEL-near-memory-computing-performance-prediction-via-ML_dac19.pdf
ramulator2
Ramulator 2.0 is a modern, modular, extensible, and fast cycle-accurate DRAM simulator. It provides support for agile implementation and evaluation of new memory system designs (e.g., new DRAM standards, emerging RowHammer mitigation techniques). Described in our paper https://people.inf.ethz.ch/omutlu/pub/Ramulator2_arxiv23.pdf
rowhammer
Source code for testing the Row Hammer error mechanism in DRAM devices. Described in the ISCA 2014 paper by Kim et al. at http://users.ece.cmu.edu/~omutlu/pub/dram-row-hammer_isca14.pdf.
SoftMC
SoftMC is an experimental FPGA-based memory controller design that can be used to develop tests for DDR3 SODIMMs using a C++ based API. The design, the interface, and its capabilities and limitations are discussed in our HPCA 2017 paper: "SoftMC: A Flexible and Practical Open-Source Infrastructure for Enabling Experimental DRAM Studies" <https://people.inf.ethz.ch/omutlu/pub/softMC_hpca17.pdf>
SparseP
SparseP is the first open-source Sparse Matrix Vector Multiplication (SpMV) software package for real-world Processing-In-Memory (PIM) architectures. SparseP is developed to evaluate and characterize the first publicly-available real-world PIM architecture, the UPMEM PIM architecture. Described by C. Giannoula et al. [https://arxiv.org/abs/2201.05072]
SAFARI Research Group at ETH Zurich and Carnegie Mellon University's Repositories
CMU-SAFARI/ramulator
A Fast and Extensible DRAM Simulator, with built-in support for modeling many different DRAM technologies including DDRx, LPDDRx, GDDRx, WIOx, HBMx, and various academic proposals. Described in the IEEE CAL 2015 paper by Kim et al. at http://users.ece.cmu.edu/~omutlu/pub/ramulator_dram_simulator-ieee-cal15.pdf
CMU-SAFARI/ramulator-pim
A fast and flexible simulation infrastructure for exploring general-purpose processing-in-memory (PIM) architectures. Ramulator-PIM combines a widely-used simulator for out-of-order and in-order processors (ZSim) with Ramulator, a DRAM simulator with memory models for DDRx, LPDDRx, GDDRx, WIOx, HBMx, and HMCx. Ramulator is described in the IEEE CAL 2015 paper by Kim et al. at https://people.inf.ethz.ch/omutlu/pub/ramulator_dram_simulator-ieee-cal15.pdf Ramulator-PIM is used in the DAC 2019 paper by Singh et al. at https://people.inf.ethz.ch/omutlu/pub/NAPEL-near-memory-computing-performance-prediction-via-ML_dac19.pdf
CMU-SAFARI/SoftMC
SoftMC is an experimental FPGA-based memory controller design that can be used to develop tests for DDR3 SODIMMs using a C++ based API. The design, the interface, and its capabilities and limitations are discussed in our HPCA 2017 paper: "SoftMC: A Flexible and Practical Open-Source Infrastructure for Enabling Experimental DRAM Studies" <https://people.inf.ethz.ch/omutlu/pub/softMC_hpca17.pdf>
CMU-SAFARI/DAMOV
DAMOV is a benchmark suite and a methodical framework targeting the study of data movement bottlenecks in modern applications. It is intended to study new architectures, such as near-data processing. Described by Oliveira et al. (preliminary version at https://arxiv.org/pdf/2105.03725.pdf)
CMU-SAFARI/SparseP
SparseP is the first open-source Sparse Matrix Vector Multiplication (SpMV) software package for real-world Processing-In-Memory (PIM) architectures. SparseP is developed to evaluate and characterize the first publicly-available real-world PIM architecture, the UPMEM PIM architecture. Described by C. Giannoula et al. [https://arxiv.org/abs/2201.05072]
CMU-SAFARI/SneakySnake
SneakySnake:snake: is the first and the only pre-alignment filtering algorithm that works efficiently and fast on modern CPU, FPGA, and GPU architectures. It greatly (by more than two orders of magnitude) expedites sequence alignment calculation for both short and long reads. Described in the Bioinformatics (2020) by Alser et al. https://arxiv.org/abs/1910.09020.
CMU-SAFARI/BLEND
BLEND is a mechanism that can efficiently find fuzzy seed matches between sequences to significantly improve the performance and accuracy while reducing the memory space usage of two important applications: 1) finding overlapping reads and 2) read mapping. Described by Firtina et al. (published in NARGAB https://doi.org/10.1093/nargab/lqad004)
CMU-SAFARI/Scrooge
Scrooge is a high-performance pairwise sequence aligner based on the GenASM algorithm. Scrooge includes three novel algorithmic improvements on top of GenASM, and high-performance CPU and GPU implementations. Described by Lindegger et al. at https://doi.org/10.48550/arXiv.2208.09985
CMU-SAFARI/Sibyl
Source code for the software implementation of Sibyl proposed in our ISCA 2022 paper: Gagandeep Singh et. al., "Sibyl: Adaptive and Extensible Data Placement in Hybrid Storage Systems using Online Reinforcement Learning" at https://people.inf.ethz.ch/omutlu/pub/Sibyl_RL-based-data-placement-in-hybrid-storage-systems_isca22.pdf
CMU-SAFARI/GenASM
Source code for the software implementations of the GenASM algorithms proposed in our MICRO 2020 paper: Senol Cali et. al., "GenASM: A High-Performance, Low-Power Approximate String Matching Acceleration Framework for Genome Sequence Analysis" at https://people.inf.ethz.ch/omutlu/pub/GenASM-approximate-string-matching-framework-for-genome-analysis_micro20.pdf
CMU-SAFARI/FastRemap
FastRemap, a C++ tool for quickly remapping reads between genome assemblies based on the commonly used CrossMap tool. Link to paper: https://arxiv.org/pdf/2201.06255.pdf
CMU-SAFARI/NOCulator
NOCulator is a network-on-chip simulator providing cycle-accurate performance models for a wide variety of networks (mesh, torus, ring, hierarchical ring, flattened butterfly) and routers (buffered, bufferless, Adaptive Flow Control, minBD, HiRD).
CMU-SAFARI/SPARTA
A novel spatial accelerator for horizontal diffusion weather stencil computation, as described in ICS 2023 paper by Singh et al. (https://arxiv.org/pdf/2303.03509.pdf)
CMU-SAFARI/pim-ml
PIM-ML is a benchmark for training machine learning algorithms on the UPMEM architecture, which is the first publicly-available real-world processing-in-memory (PIM) architecture. Described in the ISPASS 2023 paper by Gomez-Luna et al. (https://arxiv.org/pdf/2207.07886.pdf).
CMU-SAFARI/BlockHammer
Source code for the cycle-level simulator and RTL implementation of BlockHammer proposed in our HPCA 2021 paper: Yaglikci et. al., "BlockHammer: Preventing RowHammer at Low Cost by Blacklisting Rapidly-Accessed DRAM Rows" at https://people.inf.ethz.ch/omutlu/pub/BlockHammer_preventing-DRAM-rowhammer-at-low-cost_hpca21.pdf
CMU-SAFARI/pLUTo
pLUTo is a DRAM-based Processing-using-Memory architecture that leverages the high density of DRAM to enable the massively parallel storing and querying of lookup tables (LUTs)
CMU-SAFARI/NATSA
NATSA is the first near-data-processing accelerator for time series analysis based on the Matrix Profile (SCRIMP) algorithm. NATSA exploits modern 3D-stacked High Bandwidth Memory (HBM) to enable efficient and fast matrix profile computation near memory. Described in ICCD 2020 by Fernandez et al. https://people.inf.ethz.ch/omutlu/pub/NATSA_time-series-analysis-near-data_iccd20.pdf
CMU-SAFARI/U-TRR
Source code of the U-TRR methodology presented in "Uncovering In-DRAM RowHammer Protection Mechanisms: A New Methodology, Custom RowHammer Patterns, and Implications", https://people.inf.ethz.ch/omutlu/pub/U-TRR-uncovering-RowHammer-protection-mechanisms_micro21.pdf
CMU-SAFARI/SeGraM
Source code for the software implementation of SeGraM proposed in our ISCA 2022 paper: Senol Cali et. al., "SeGraM: A Universal Hardware Accelerator for Genomic Sequence-to-Graph and Sequence-to-Sequence Mapping" at https://people.inf.ethz.ch/omutlu/pub/SeGraM_genomic-sequence-mapping-universal-accelerator_isca22.pdf
CMU-SAFARI/transpimlib
TransPimLib is a library for transcendental (and other hard-to-calculate) functions in general-purpose PIM systems, TransPimLib provides CORDIC-based and LUT-based methods for trigonometric functions, hyperbolic functions, exponentiation, logarithm, square root, etc. Described in ISPASS'23 paper by Item et al. (https://arxiv.org/pdf/2304.01951.pdf)
CMU-SAFARI/ApHMM-GPU
ApHMM-GPU is the first GPU implementation of the Baum-Welch algorithm for profile Hidden Markov Models (pHMMs). It includes many of the software optimizations as proposed in the ApHMM paper, which is described by Firtina et al. (preliminary version at https://arxiv.org/abs/2207.09765).
CMU-SAFARI/DRAM-Datasheet-Survey
A survey of manufacturer-provided DRAM operating parameters and timings as specified by DRAM chip datasheets from between 1970 and 2021. This data and its analysis are described in the 2022 paper by Patel et al.: https://arxiv.org/abs/2204.10378
CMU-SAFARI/MetaSys
Metasys is the first open-source FPGA-based infrastructure with a prototype in a RISC-V core, to enable the rapid implementation and evaluation of a wide range of cross-layer software/hardware cooperative techniques techniques in real hardware. Described in our ACM TACO paper: https://dl.acm.org/doi/full/10.1145/3505250
CMU-SAFARI/GateSeeder
GateSeeder is the first near-memory CPU-FPGA co-design for alleviating both the compute-bound and memory-bound bottlenecks in short and long-read mapping. GateSeeder outperforms Minimap2 by up to 40.3×, 4.8×, and 2.3× when mapping real ONT, HiFi, and Illumina reads, respectively.
CMU-SAFARI/Molecules2Variations
The first work to provide a comprehensive survey of a prominent set of algorithmic improvement and hardware acceleration efforts for the entire genome analysis pipeline used for the three most prominent sequencing data, short reads (Illumina), ultra-long reads (ONT), and accurate long reads (HiFi). Described in arXiv (2022) by Alser et al. https://arxiv.org/abs/2205.07957
CMU-SAFARI/QUAC-TRNG
All sources to reproduce the results presented in our paper, QUAC-TRNG, the highest-throughput DRAM-based true random number generator, described in https://people.inf.ethz.ch/omutlu/pub/QUAC-TRNG-DRAM_isca21.pdf
CMU-SAFARI/sasiml
An extensible and fully programable cycle-accurate Spatial Architecture simulator for Machine Learning Workloads. Described in detail in the IEEE TC 2023 paper by Orosa et al. at https://arxiv.org/abs/2202.02310
CMU-SAFARI/TargetCall
TargetCall is the first pre-basecalling filter that is applicable to a wide range of use cases to eliminate wasted computation in basecalling. Described in our preprint: https://arxiv.org/abs/2212.04953
CMU-SAFARI/alignment-in-memory
AIM (Alignment-in-Memory), A Framework for High-throughput Sequence Alignment using Real Processing-in-Memory Systems, Bioinformatics, btad155, https://doi.org/10.1093/bioinformatics/btad155
CMU-SAFARI/BioDynaMo
BioDynamo is a flexible and high-performance agent based simulation engine. This repository contains artifacts and materials to support the reproducibility of the paper: Breitwieser et al., "High-Performance and Scalable Agent-Based Simulation with BioDynaMo," accepted to PPoPP '23: https://arxiv.org/pdf/2301.06984.pdf