DamonZhao-sfu's Stars
amazon-science/redset
Redset is a dataset containing three months' worth of user query metadata from queries that ran on a selected sample of instances in the Amazon Redshift fleet. We provide query metadata for 200 provisioned and 200 serverless instances.
manuzhang/awesome-streaming
A curated list of awesome streaming frameworks, applications, etc.
chdb-io/chdb
chDB is an in-process OLAP SQL Engine powered by ClickHouse
wagjamin/inkfuse
InkFuse - An Experimental Database Runtime Unifying Vectorized and Compiled Query Execution.
duckdblabs/db-benchmark
reproducible benchmark of database-like ops
bheisler/iai
Experimental one-shot benchmarking/profiling harness for Rust
hyrise/tpch_paper
Online Resources for the Paper 'Quantifying TPC-H Choke Points and Their Optimizations'
smola/spark-glusterfs-example
An example of Apache Spark integration with GlusterFS.
uber-common/jvm-profiler
JVM Profiler Sending Metrics to Kafka, Console Output or Custom Reporter
pgvector/pgvector
Open-source vector similarity search for Postgres
intel/BDTK
A modular acceleration toolkit for big data analytic engines
pola-rs/polars
Dataframes powered by a multithreaded, vectorized query engine, written in Rust
gregrahn/join-order-benchmark
Join Order Benchmark (JOB)
HigherOrderCO/Bend
A massively parallel, high-level programming language
apache/datafusion-benchmarks
Apache DataFusion Benchmarks
hpides/autovec-db
Code for our paper "Evaluating SIMD Compiler-Intrinsics for Database Systems"
duckdb/duckdb
DuckDB is an analytical in-process SQL database management system
bigstepinc/SparkBench
Terasort-like benchmark for Spark 2.x that uses DataFrames and saves files in Parquet, etc., for more realistic testing.
abstools/timsort-benchmark
Java TimSort Benchmarking
intel/PerTaskMemBWMonitoring
intel/pcm
Intel® Performance Counter Monitor (Intel® PCM)
chipsalliance/chisel
Chisel: A Modern Hardware Design Language
open-mpi/hwloc
Hardware locality (hwloc)
oneapi-src/unified-memory-framework
A library for constructing allocators and memory pools. It also contains broadly useful abstractions and utilities for memory management. UMF allows users to manage multiple memory pools characterized by different attributes, allowing certain allocation types to be isolated from others and allocated using different hardware resources as required.
apache/datafusion-comet
Apache DataFusion Comet Spark Accelerator
apache/datafusion
Apache DataFusion SQL Query Engine
chukonu-team/polars
Fast multi-threaded, hybrid-out-of-core query engine focussing on DataFrame front-ends
Azure-Samples/azure-sparkcruise-samples
Docs for Azure HDInsight
LucaCanali/sparkMeasure
This is the development repository for sparkMeasure, a tool and library designed for efficient analysis and troubleshooting of Apache Spark jobs. It focuses on easing the collection and examination of Spark metrics, making it a practical choice for both developers and data engineers.
tomsisso/spark-profiling-plugin
Spark plugin implementation for profiling a Spark app with context