marsupialtail's Stars
y-256/libdivsufsort
A lightweight suffix-sorting library
huggingface/datatrove
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
ekzhu/datasketch
MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
BatsResearch/alfred
A system for prompted weak supervision.
TracecatHQ/tracecat
The open source Tines / Splunk SOAR alternative.
Daniel-Liu-c0deb0t/simple-saca
Hardware go brrr bounded context suffix array construction algorithm
jxmorris12/bm25_pt
Minimal PyTorch implementation of BM25 (with sparse tensors)
Mause/duckdb-deltatable-extension
A purely experimental DuckDB Delta Lake extension
Vince7778/chip8
A chip8 emulator written in Rust.
apache/iceberg-rust
Apache Iceberg
apache/opendal
Apache OpenDAL: access data freely.
stanfordnlp/dspy
DSPy: The framework for programming—not prompting—foundation models
VikParuchuri/surya
OCR, layout analysis, reading order, line detection in 90+ languages
jiacai2050/prom-remote-api
Prometheus remote storage API for Rust
paradigmxyz/cryo
cryo is the easiest way to extract blockchain data to parquet, csv, json, or python dataframes
timescale/tsbs
Time Series Benchmark Suite, a tool for comparing and evaluating databases for time series data
PyO3/maturin
Build and publish crates with pyo3, cffi and uniffi bindings as well as rust binaries as python packages
jeromefroe/tsz-rs
A crate for time series compression based upon Facebook's Gorilla whitepaper
triton-lang/triton
Development repository for the Triton language and compiler
databrickslabs/transpiler
SIEM-to-Spark Transpiler
VivekPanyam/carton
Run any ML model from any programming language.
THUBear-wjy/LogGrep-zstd
Open-source code for the paper "LogGrep: Fast and Cheap Cloud Log Storage by Exploiting both Static and Runtime Patterns" (EuroSys 2023). This version uses zstd as the packing method.
lemire/simdcomp
A simple C library for compressing lists of integers using binary packing
THUBear-wjy/LogReducer
Open-source code for "On the Feasibility of Parser-based Log Compression in Large-Scale Cloud Systems" (USENIX FAST 2021)
dvassallo/s3-benchmark
Measure Amazon S3's performance from any location.
y-scope/clp-ffi-py
clp-ffi-py is a Python library to encode log messages with CLP, and work with the encoded messages using a foreign function interface (FFI).
THUBear-wjy/LogGrep
Open-source repository for the paper "LogGrep: Fast and Cheap Cloud Log Storage by Exploiting both Static and Runtime Patterns" (ACM EuroSys 2023)
quickwit-oss/quickwit-datasource
Quickwit data source for Grafana
martende/findex
FM index with regular expressions
facebook/zstd
Zstandard - Fast real-time compression algorithm