apache-arrow
There are 134 repositories under the apache-arrow topic.
pixie-io/pixie
Instant Kubernetes-Native Application Observability
lancedb/lance
Modern columnar data format for ML and LLMs implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, PyArrow, and PyTorch, with more integrations coming.
aws/aws-sdk-pandas
pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
polarsignals/frostdb
❄️ Coolest database around 🧊 Embeddable column database written in Go.
scikit-hep/awkward
Manipulate JSON-like data with NumPy-like idioms.
visgl/loaders.gl
Loaders for big data visualization.
developmentseed/lonboard
A Python library for fast, interactive geospatial vector data visualization in Jupyter.
geopolars/geopolars
Geospatial extensions for Polars
unum-cloud/ustore
Multi-Modal Database replacing MongoDB, Neo4J, and Elastic with 1 faster ACID solution, with NetworkX and Pandas interfaces, and bindings for C 99, C++ 17, Python 3, Java, GoLang 🗄️
kylebarron/parquet-wasm
Rust-based WebAssembly bindings to read and write Apache Parquet data
geoarrow/geoarrow
Specification for storing geospatial data in Apache Arrow
1duo/awesome-ai-infrastructures
Infrastructures™ for Machine Learning Training/Inference in Production.
geoarrow/geoarrow-rs
GeoArrow in Rust, Python, and JavaScript (WebAssembly) with vectorized geometry operations
apache/arrow-julia
Official Julia implementation of Apache Arrow
nevi-me/rust-dataframe
A Rust DataFrame implementation, built on Apache Arrow
cldellow/sqlite-parquet-vtable
A SQLite vtable extension to read Parquet files
abs-tudelft/fletcher
Fletcher: A framework to integrate FPGA accelerators with Apache Arrow
scikit-hep/awkward-0.x
Manipulate arrays of complex data structures as easily as NumPy.
G-Research/ParquetSharp
ParquetSharp is a .NET library for reading and writing Apache Parquet files.
google/space
Unified storage framework for the entire machine learning lifecycle
nanoporetech/pod5-file-format
Pod5: a high performance file format for nanopore reads.
mattf96s/QuackDB
Open-source in-browser DuckDB SQL editor
kylebarron/arro3
A minimal Python library for Apache Arrow, connecting to the Rust arrow crate
apache/arrow-go
Official Go implementation of Apache Arrow
geoarrow/deck.gl-layers
deck.gl layers for rendering GeoArrow data
kylebarron/arrow-js-ffi
Zero-copy reading of Arrow data from WebAssembly
mongodb-labs/mongo-arrow
MongoDB integrations for Apache Arrow. Export MongoDB documents to NumPy arrays, Parquet files, and pandas DataFrames in one line of code.
cmudig/falcon-vis
Cross-filter millions (or even billions) of data entries with no interaction delay
igor-suhorukov/openstreetmap_h3
High-performance data loader for OSM planet dumps. Transforms an OpenStreetMap world or region PBF dump into a lossless PostGIS pgsnapshot OSM schema representation partitioned by H3 regions, and/or into Arrow IPC/Parquet dumps.
man-group/sparrow
C++20 idiomatic APIs for the Apache Arrow Columnar Format
duo-rs/duo
A lightweight Logging and Tracing observability solution for Rust, built with Apache Arrow, Apache Parquet and Apache DataFusion.
abdenlab/oxbow
Read specialized NGS formats as data frames in R, Python, and more.
red-data-tools/red_amber
A dataframe library for Rubyists.
cldellow/csv2parquet
Convert a CSV to a Parquet file.
elixir-explorer/adbc
Apache Arrow ADBC bindings for Elixir
baggiponte/awesome-pandas-alternatives
Awesome list of alternative dataframe libraries in Python.