rok
Started as a Physicist. Open source contributor. Interested in data science and data tooling.
Freelance
Amsterdam, Netherlands
Pinned Repositories
arrow
Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
asv
Airspeed Velocity: A simple Python benchmarking tool with web-based reporting
awshub
An EC2 based deployment of jupyterhub (based on https://github.com/jupyterhub/jupyterhub-deploy-teaching)
dolines
Map of Slovenian dolines
label-wrapper
User-friendly image bootstrapping framework.
rok.github.io
My tech blog
ssim
IATA Standard Schedules Information Manual file format parser
rok's Repositories
rok/ssim
IATA Standard Schedules Information Manual file format parser
rok/dolines
Map of Slovenian dolines
rok/label-wrapper
User-friendly image bootstrapping framework.
rok/arrow
Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and interprocess communication. Languages currently supported include C, C++, Java, JavaScript, Python, and Ruby.
rok/arrow-migration
rok/arrow-site
Mirror of Apache Arrow site
rok/asset-tracker-cloud-firmware
Out-of-tree copy of the Asset Tracker v2 application, showing how to use open-source tools to automate building HEX files and to continuously integrate the firmware against the cloud implementation of the Asset Tracker Example.
rok/aws
Amazon Web Services based implementation of Bifravst
rok/bifravst
Bifravst aims to provide a concrete end-to-end example for an IoT product in the asset tracker space, a Cat Tracker.
rok/cfep
conda-forge's Enhancement Proposal
rok/crossbow
Extra CI for Apache Arrow
rok/datatart.com
Website of datatart.com
rok/DNAscan
DNAscan is a fast and efficient bioinformatics pipeline that allows for the analysis of DNA Next Generation sequencing data, requiring very little computational effort and memory usage.
rok/flownet2-pytorch
PyTorch implementation of FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks
rok/geopandas
Python tools for geographic data
rok/pandas
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
rok/pydata-pandas-workshop
Material for my PyData Jupyter & Pandas workshops. I'm also available for in-house training on request.
rok/pyexasol
Exasol Python driver with low overhead, fast HTTP transport and compression
rok/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
rok/scikit-learn
scikit-learn: machine learning in Python
rok/lance
Modern columnar data format for ML and LLMs implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, and PyArrow, with more integrations coming.
rok/parquet-benchmark
Auxiliary files for benchmarking Apache Parquet
rok/parquet-format
Apache Parquet
rok/parquet-testing
Auxiliary files for compatibility and integration tests for Apache Parquet
rok/substrait
A cross-platform way to express data transformation, relational algebra, standardized record expression and plans.
rok/test-parquet-cpp
rok/test-parquet-format
rok/test-parquet-java
rok/test-parquet-site
rok/test-parquet-testing