holdenk
Holden Karau is a trans Canadian and an open source contributor. She is a Spark committer and co-author of Learning Spark, High Performance Spark, and Kubeflow for ML.
Open Source Big Data Dev
San Francisco, CA, USA
holdenk's Stars
mastodon/mastodon
Your self-hosted, globally interconnected microblogging community
charlax/professional-programming
A collection of learning resources for curious software engineers
pingcap/tidb
TiDB - the open-source, cloud-native, distributed SQL database designed for modern applications.
togethercomputer/OpenChatKit
jzhang38/TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
treeverse/lakeFS
lakeFS - Data version control for your data lake | Git for data
JohnSnowLabs/spark-nlp
State of the Art Natural Language Processing
TimelyDataflow/timely-dataflow
A modular implementation of timely dataflow in Rust
bkerler/edl
Inofficial Qualcomm Firehose / Sahara / Streaming / Diag Tools :)
bytewax/bytewax
Python Stream Processing
pyspark-ai/pyspark-ai
English SDK for Apache Spark
mrpowers-io/quinn
pyspark methods to enhance developer productivity 📣 👯 🎉
kbressem/medAlpaca
LLM finetuned for medical question answering
Jeija/bluefluff
Reverse Engineering Furby Connect's Bluetooth Protocol and Update Format
michaelrsweet/lprint
A Label Printer Application
apache/spark-connect-go
Apache Spark Connect Client for Golang
Nike-Inc/spark-expectations
A Python Library to support running data quality rules while the spark job is running⚡
kendallgoto/switchbota
Replaces the factory firmware on the SwitchBot Plug Mini via OTA, enabling the use of Tasmota without disassembling the unit.
target/data-validator
A tool to validate data, built around Apache Spark.
filibuster-testing/filibuster
Prototype implementation of Service-Level Fault Injection Testing in Python.
yahoo/imapnio
Java IMAP NIO client that is designed to scale well for thousands of connections per machine and reduce contention when using a large number of threads and CPUs.
howlrapp/howlr-app
Main repository for the Howlr application
yudataguy/RawRAG
Let's RAG it RAW without fancy frameworks
endreszabo/PowerDNS-Dynamic-Reverse-Backend
A PowerDNS pipe dynamic backend to serve dnswall style A, AAAA and PTR DNS records for any given CIDR ranges.
g588928812/bitsandbytes_jetsonX
8-bit CUDA functions for PyTorch, modified to build on Jetson Xavier
ajozwik/pekko-smtp-server
eblocha/django-encrypted-files
Encrypt files uploaded to a Django application.
konexios/moonstone
Open source version of Arrow Connect Platform developed by Arrow Electronics
feedernet/petnet-android
magicalpipelines/docker-ksql-multilingual-udfs-poc
A POC for multilingual UDFs in KSQL