ruslandanilin
I believe that behind every AI breakthrough lies robust data infrastructure and powerful tools. I'm passionate about architecting the platforms that drive it.
@DeviantArt / @wix
Victoria, Canada
ruslandanilin's Stars
apache/spark
Apache Spark - A unified analytics engine for large-scale data processing
apache/kafka
Mirror of Apache Kafka
apache/flink
Apache Flink
dgraph-io/dgraph
The high-performance database for modern applications
recommenders-team/recommenders
Best Practices on Recommendation Systems
vectordotdev/vector
A high-performance observability data pipeline.
prestodb/presto
The official home of the Presto distributed SQL query engine for big data
apache/hadoop
Apache Hadoop
OpenRA/OpenRA
Open Source real-time strategy game engine for early Westwood games such as Command & Conquer: Red Alert, written in C# using SDL and OpenGL. Runs on Windows, Linux, *BSD and Mac OS X.
fluent/fluentd
Fluentd: Unified Logging Layer (project under CNCF)
elastic/beats
🐠 Beats - Lightweight shippers for Elasticsearch & Logstash
debezium/debezium
Change data capture for a variety of databases. Please log issues at https://issues.redhat.com/browse/DBZ.
onceupon/Bash-Oneliner
A collection of handy Bash One-Liners and terminal tricks for data processing and Linux system maintenance.
apache/iceberg
Apache Iceberg
vespa-engine/vespa
AI + Data, online. https://vespa.ai
apache/hbase
Apache HBase
redpanda-data/console
Redpanda Console is a developer-friendly UI for managing your Kafka/Redpanda workloads. Console gives you a simple, interactive approach for gaining visibility into your topics, masking data, managing consumer groups, and exploring real-time data with time-travel debugging.
wzchen/probability_cheatsheet
A comprehensive 10-page probability cheatsheet that covers a semester's worth of introduction to probability.
jesselpalmer/the-engineering-managers-booklist
Books for people who are or aspire to manage/lead team(s) of software engineers
DREAM-DK/MAKRO
MaurizioFD/RecSys2019_DeepLearning_Evaluation
This is the repository of our article published in RecSys 2019 "Are We Really Making Much Progress? A Worrying Analysis of Recent Neural Recommendation Approaches" and of several follow-up studies.
Netflix/iceberg
Iceberg is a table format for large, slow-moving tabular data
wix-incubator/quix
Quix Notebook Manager
swoop-inc/spark-alchemy
Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
memiiso/debezium-server-iceberg
Replicates any database (CDC events) to Apache Iceberg (To Cloud Storage)
rdblue/s3committer
Hadoop output committers for S3
spotify/async-datastore-client
A modern and feature-rich Asynchronous Java client for Google Cloud Datastore
spotify/docker-bigtable
A docker container with an in memory implementation of Google Cloud Bigtable
findify/featury
Friendly ML feature store
wix-incubator/wix-webdriver-manager
Wix WebDriver Manager