maropu
OSS engineer@R&D, Ph.D. in CS (Database Systems) - Apache Spark PMC&committer, Apache Hivemall PPMC, PostgreSQL enthusiast - LLVM/C/C++03/Java/Scala/Rust/Python
Tokyo/Japan
Pinned Repositories
incubator-hivemall
Mirror of Apache Hivemall (incubating)
datasketches-spark
Data Sketches for Apache Spark
hivemall-spark
A Hivemall wrapper for Spark
integer_encoding_library
An encoder/decoder collection for a sequence of integers
lljvm-translator
A lightweight library to inject LLVM bitcode into JVMs
spark-data-repair-plugin
Provides functionality to build statistical models that repair dirty tabular data in Spark
spark-sql-flow-plugin
Visualize column-level data lineage in Spark SQL
spark-sql-server
Yet Another Spark SQL JDBC/ODBC server based on the PostgreSQL V3 protocol
spark-tpcds-datagen
All the things about TPC-DS in Apache Spark
vpacker
A simple integer compression library for C/C++/Java
maropu's Repositories
maropu/spark-tpcds-datagen
All the things about TPC-DS in Apache Spark
maropu/spark-sql-flow-plugin
Visualize column-level data lineage in Spark SQL
maropu/spark-sql-server
Yet Another Spark SQL JDBC/ODBC server based on the PostgreSQL V3 protocol
maropu/datasketches-spark
Data Sketches for Apache Spark
maropu/spark-data-repair-plugin
Provides functionality to build statistical models that repair dirty tabular data in Spark
maropu/spark-query-log-plugin
A simple toolkit to analyze Spark query logs
maropu/fuzz-testing-for-spark
[WIP] Run SQL-aware fuzz tests for the Catalyst optimizer in Apache Spark
maropu/spark-graphx-pregel-personalized-pagerank
Personalized PageRank on Pregel/GraphX
maropu/mlflow-example
Example code for MLflow
maropu/spark-executor-dict-plugin
Fast Read-only Data Dictionary Attached to Each Spark Executor
maropu/jupyterlab-dockerfile
A Dockerfile for JupyterLab including PySpark
maropu/jvmci-test
A toy box for testing JVMCI in JDK 11
maropu/predictive-testing
maropu/equipartitioning-example
Equipartitioning in Spark
maropu/janino
Janino is a super-small, super-fast Java™ compiler.
maropu/LAMA
LAnguage Model Analysis
maropu/link-prediction-with-anyburl
maropu/lstm-crf-pytorch
LSTM-CRF in PyTorch
maropu/maropu
maropu/neon
Neon: Serverless Postgres. We separated storage and compute to offer autoscaling, branching, and bottomless storage.
maropu/pg_stats_exporter
A PostgreSQL metrics exporter for Prometheus.
maropu/pgvector
Open-source vector similarity search for Postgres
maropu/polars
Fast multi-threaded, hybrid-out-of-core DataFrame library in Rust | Python | Node.js
maropu/pydeps-neo4j
Exports Python package dependencies into Neo4j
maropu/rag-postgres
A trial place for RAG with PostgreSQL resources
maropu/sedona
A cluster computing framework for processing large-scale geospatial data
maropu/spark
Mirror of Apache Spark
maropu/spark-sql-perf
maropu/spark-tpcds-sf-1
TPC-DS queries with 1GB scale factor
maropu/spark-website
Mirror of Apache Spark Website