jaceklaskowski
Freelance Data(bricks) Engineer | #ApacheSpark #DeltaLake #UnityCatalog #Databricks #ApacheKafka #KafkaStreams | Java Champion | @apache | #DatabricksBeacons
Freelance Data(bricks) Engineer
Warsaw, Poland
Pinned Repositories
learn-databricks
Notebooks to learn Databricks Lakehouse Platform
scalania
Learn Scala by examples
spark-workshop
Apache Spark™ and Scala Workshops
apache-spark-internals
The Internals of Apache Spark
delta-lake-internals
The Internals of Delta Lake
spark-sql-internals
The Internals of Spark SQL
spark-structured-streaming-internals
The Internals of Spark Structured Streaming
unity-catalog-internals
The Internals of Unity Catalog
jaceklaskowski's Repositories
jaceklaskowski/spark-workshop
Apache Spark™ and Scala Workshops
jaceklaskowski/kafka-notebook
The Internals of Apache Kafka
jaceklaskowski/spark-kubernetes-book
The Internals of Spark on Kubernetes
jaceklaskowski/kafka-workshop
Materials (slides and code) for Kafka and Kafka Streams Workshops
jaceklaskowski/spark-delta-lake-workshop
Spark and Delta Lake Workshop
jaceklaskowski/learn-databricks
Notebooks to learn Databricks Lakehouse Platform
jaceklaskowski/scala-academy
Scala Academy
jaceklaskowski/spark-meetups
Learning Spark on Kubernetes in a series of Warsaw Data Engineering meetups online!
jaceklaskowski/jaceklaskowski.github.io
Website
jaceklaskowski/spark-examples
Apache Spark Examples
jaceklaskowski/spark
Mirror of Apache Spark
jaceklaskowski/delta
An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.
jaceklaskowski/trino-meetups
Learning Trino in a series of Warsaw Data Engineering meetups online!
jaceklaskowski/unitycatalog
Open, Multi-modal Catalog for Data & AI
jaceklaskowski/ccloud-gitpod-demo
demo ccloud + gitpod
jaceklaskowski/cloud-bigtable-examples
Examples of how to use Cloud Bigtable both with GCE map/reduce as well as stand alone applications.
jaceklaskowski/dbt-spark
dbt-spark contains all of the code enabling dbt to work with Apache Spark and Databricks
jaceklaskowski/docs
Linode guides and tutorials.
jaceklaskowski/docusaurus-tutorial
https://docusaurus.io/docs/en/tutorial-setup
jaceklaskowski/horovod
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
jaceklaskowski/hudi
Upserts, Deletes And Incremental Processing on Big Data.
jaceklaskowski/iceberg
Apache Iceberg
jaceklaskowski/jaceklaskowski
My personal repository
jaceklaskowski/java-docs-samples
Java and Kotlin Code samples used on cloud.google.com
jaceklaskowski/ksql
The database purpose-built for stream processing applications.
jaceklaskowski/LightGBM
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
jaceklaskowski/scala-academy-sandbox
jaceklaskowski/spark-flowchart
Flowchart for debugging Spark applications
jaceklaskowski/xgboost
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
jaceklaskowski/yurii-double-metrics
Spark app to demo multiple executions of flatMapGroupsWithState's stateUpdateFunc when used with DeltaTable.merge