alastor1729

Scala Spark Data Engineer && PySpark Data Engineer && dog-owner

Remote

Pinned Repositories

actorbintree
From edX Online Course: using Scala and Akka Actors, create an actor-based binary tree set where each node is represented by one actor
Language:Scala
AI-For-Beginners-fork
12 Weeks, 24 Lessons, AI for All!
Language:Jupyter Notebook
akka-cassandra-demo
The repository for the demonstration of Akka & Cassandra integration
Language:Scala
akka-dlt-poc
Simple project for a "Jobcoin Digital Ledger / DLT-based" POC using Scala 2.13.5, SBT (1.5.3), Typed Akka Actors (2.6.14), and Akka HTTP (10.2.4)
Language:Scala
dbldatagen-forked
Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for test, POCs, and other uses in Databricks environments including in Delta Live Tables pipelines
Language:Python
rockthejvm-spark-essentials-FORK
The official repository for the Rock the JVM Spark Essentials with Scala course
Language:Scala
scala-postgres-tutorial
Showing example of using Http4s + Doobie + Postgres combination with minimal implementations
Language:Scala
scala_pekko_streams_tutorial_FORK
A collection of runnable and self-contained examples inspired by various akka-streams (pekko-streams), Alpakka (Pekko connectors) and akka-http (pekko-http) docs, tutorials and blogs
Language:Scala
spark
Apache Spark - A unified analytics engine for large-scale data processing
Language:Scala
starwars-scala-rest-api
A command-line Scala && Akka application that takes a planet name from the Star Wars universe and returns a list of people who are from that planet
Language:Scala

alastor1729's Repositories

alastor1729/dbldatagen-forked
Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for test, POCs, and other uses in Databricks environments including in Delta Live Tables pipelines
Language:Python
alastor1729/rockthejvm-spark-essentials-FORK
The official repository for the Rock the JVM Spark Essentials with Scala course
Language:Scala
alastor1729/scala_pekko_streams_tutorial_FORK
A collection of runnable and self-contained examples inspired by various akka-streams (pekko-streams), Alpakka (Pekko connectors) and akka-http (pekko-http) docs, tutorials and blogs
Language:Scala
alastor1729/spark
Apache Spark - A unified analytics engine for large-scale data processing
Language:Scala
alastor1729/apache-druid-fork
Apache Druid: a high performance real-time analytics database.
Language:Java
alastor1729/apache-kafka-FORK
Mirror of Apache Kafka GitHub page
alastor1729/consulting-handbook-FORK
A guide for technical professionals looking to start consulting
alastor1729/data-engineer-handbook
This is a repo with links to everything you'd ever want to learn about data engineering
alastor1729/Databricks-Data-Engineer-Associate-Udemy-FORKED
The resources of the preparation course for Databricks Data Engineer Associate certification exam...
alastor1729/databricks-labs-mosaic-FORK
An extension to the Apache Spark framework that allows easy and fast processing of very large geospatial datasets.
Language:Jupyter Notebook
alastor1729/dd_poker_source_code_FORK
DD Poker Source Code
Language:Java
alastor1729/graph-rag-FORK
A modular graph-based Retrieval-Augmented Generation (RAG) system
Language:Python
alastor1729/incubator-gluten-FORK
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
alastor1729/kyuubi-fork
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
Language:Scala
alastor1729/llama-recipes_FORK
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supporting a number of candid inference solutions such as HF TGI, VLLM for local or cloud deployment. Demo apps to showcase Meta Llama3 for WhatsApp & Messenger.
Language:Jupyter Notebook
alastor1729/meta-llama_3_FORK
The official Meta Llama 3 GitHub site
alastor1729/metaflow-forked
:rocket: Build and manage real-life data science projects with ease!
Language:Python
alastor1729/mlops-on-gcp-FORKED
alastor1729/netflix-scala-atlas_FORK
In-memory dimensional time series database.
alastor1729/open-ai-evals-fork
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
Language:Python
alastor1729/openai-cookbook-FORK
Examples and guides for using the OpenAI API
Language:MDX
alastor1729/openai-python-forked
The official Python library for the OpenAI API
Language:Python
alastor1729/overwatch-FORK-ToDo-2024
Capture deep metrics on one or all assets within a Databricks workspace
Language:Scala
alastor1729/pandas-py-forked
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
Language:Python
alastor1729/Python-Implemented-Algorithms
All Algorithms implemented in Python
alastor1729/remorph-DBL-forked
Cross-compiler and Data Reconciler into Databricks Lakehouse
alastor1729/scala-mdoc-FORK
Typechecked markdown documentation for Scala
alastor1729/spark-nlp-FORK
State of the Art Natural Language Processing
Language:Scala
alastor1729/system-design-primer-FORK
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
Language:Python
alastor1729/unity-catalog-FORK
Open, Multi-modal Catalog for Data & AI
Language:Java