vihag's Stars
unitycatalog/unitycatalog
Open, Multi-modal Catalog for Data & AI
databrickslabs/brickster
R Toolkit for Databricks
PacktPublishing/Business-Intelligence-with-Databricks-SQL-Analytics
Business Intelligence with Databricks SQL Analytics, published by Packt
databricks/terraform-provider-databricks
Databricks Terraform Provider
mrchristine/db_jobs_janitor
Automated job to clean up the job definitions in Databricks using AWS Lambda
mrchristine/db_cluster_janitor
Cluster cleanup for Databricks demo environments
JohnSnowLabs/spark-nlp
State of the Art Natural Language Processing
joelcthomas/modeldrift
Capturing model drift and handling its response - Example webinar
maxpumperla/elephas
Distributed Deep learning with Keras & Spark
awslabs/deequ
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
qubole/mlflow
Open source platform for the machine learning lifecycle
ageron/handson-ml
⛔️ DEPRECATED – See https://github.com/ageron/handson-ml3 instead.
tfutils/tfenv
Terraform version manager
big-data-europe/docker-hive
big-data-europe/docker-hive-metastore-postgresql
PostgreSQL configured to work as a metastore for Hive.
qubole/customer-success
teamclairvoyant/airflow-rest-api-plugin
A plugin for Apache Airflow that exposes REST endpoints for the Command Line Interfaces
jaihind213/druid_stuff
some druid stuff
ankitdixit/prestoeventlistener
Implementation to collect queryInfo in S3 using the Presto event listener
hiyer/qubole-terraform
Terraform templates for setting up Qubole-ready environments
mlflow/mlflow-apps
MLflow App Library
qubole/sparklens
Qubole Sparklens tool for performance tuning Apache Spark
JanusGraph/janusgraph
JanusGraph: an open-source, distributed graph database
japila-books/apache-spark-internals
The Internals of Apache Spark
mahmoudparsian/data-algorithms-book
MapReduce, Spark, Java, and Scala for Data Algorithms Book