lxynov's Stars
awesomedata/awesome-public-datasets
A topic-centric list of high-quality open datasets.
minio/minio
MinIO is a high-performance, S3-compatible object store, open sourced under the GNU AGPLv3 license.
ray-project/ray
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
zsh-users/zsh-syntax-highlighting
Fish shell-like syntax highlighting for Zsh.
mlflow/mlflow
Open source platform for the machine learning lifecycle
airbytehq/airbyte
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
helm/charts
⚠️(OBSOLETE) Curated applications for Kubernetes
open-policy-agent/opa
Open Policy Agent (OPA) is an open source, general-purpose policy engine.
crossplane/crossplane
The Cloud Native Control Plane
spinnaker/spinnaker
Spinnaker is an open source, multi-cloud continuous delivery platform for releasing software changes with high velocity and confidence.
bitnami/charts
Bitnami Helm Charts
bentoml/BentoML
The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!
lancedb/lancedb
Developer-friendly, serverless vector database for AI applications. Easily add long-term memory to your LLM apps!
lancedb/lance
Modern columnar data format for ML and LLMs implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, and PyTorch, with more integrations coming.
facebookincubator/velox
A composable and fully extensible C++ execution engine library for data management systems.
apache/kyuubi
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
uber-common/jvm-profiler
JVM Profiler Sending Metrics to Kafka, Console Output or Custom Reporter
Intel-bigdata/HiBench
HiBench is a big data benchmark suite.
apache/incubator-gluten
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
substrait-io/substrait
A cross-platform way to express data transformation, relational algebra, standardized record expression and plans.
apache/ranger
Apache Ranger - To enable, monitor and manage comprehensive data security across the Hadoop platform and beyond
apache/incubator-livy
Apache Livy is an open source REST interface for interacting with Apache Spark from anywhere.
aws/aws-graviton-getting-started
Helping developers to use AWS Graviton2, Graviton3, and Graviton4 processors which power the 6th, 7th, and 8th generation of Amazon EC2 instances (C6g[d], M6g[d], R6g[d], T4g, X2gd, C6gn, I4g, Im4gn, Is4gen, G5g, C7g[d][n], M7g[d], R7g[d], R8g).
trinodb/trino-python-client
Python client for Trino
apple/batch-processing-gateway
The gateway component to make Spark on K8s much easier for Spark users.
IBM/spark-tpc-ds-performance-test
Use the TPC-DS benchmark to test Spark SQL performance
martint/jmxutils
Exporting JMX mbeans made easy
Lewuathe/docker-trino-cluster
Multi-node Presto cluster on Docker containers
DataDog/jmxfetch
Export JMX metrics
datapunchorg/punch
This project provides a fully automated one-click experience for creating cloud and Kubernetes environments to run data analytics workloads such as Apache Spark.