spark-sql
There are 786 repositories under the spark-sql topic.
getredash/redash
Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
apache/kyuubi
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
dotnet/spark
.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
almond-sh/almond
A Scala kernel for Jupyter
apache/incubator-gluten
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
databricks/LearningSparkV2
This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
oeljeklaus-you/UserActionAnalyzePlatform
Big data platform for e-commerce user behavior analysis
ploomber/jupysql
Better SQL in Jupyter. 📊
qubole/sparklens
Qubole Sparklens tool for performance tuning Apache Spark
japila-books/spark-sql-internals
The Internals of Spark SQL
kevinschaich/pyspark-cheatsheet
🐍 Quick reference guide to common patterns & functions in PySpark.
zsvoboda/ngods-stocks
New Generation Open Source Data Stack Demo
microsoft/data-accelerator
Data Accelerator for Apache Spark simplifies onboarding to Streaming of Big Data. It offers a rich, easy-to-use experience to help with creation, editing and management of Spark jobs on Azure HDInsight or Databricks while enabling the full power of the Spark engine.
cuebook/cuelake
Use SQL to build ELT pipelines on a data lakehouse.
jaceklaskowski/spark-workshop
Apache Spark™ and Scala Workshops
Qbeast-io/qbeast-spark
Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary!
Chabane/bigdata-playground
A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL
bluishglc/bdp
A prototype big data platform project, containing the source code for the book Big Data Platform Architecture and Prototype
polomarcus/Spark-Structured-Streaming-Examples
Spark Structured Streaming / Kafka / Cassandra / Elastic
mc2-project/opaque-sql
An encrypted data analytics platform
xiaogp/recsys_spark
Spark SQL implementations of ItemCF, UserCF, and Swing; recommender systems, recommendation algorithms, collaborative filtering
LearningJournal/Spark-Streaming-In-Python
Apache Spark 3 - Structured Streaming Course Material
wangj1106/recommendMoteur
Movie recommendation system, movie recommendation engine; a movie recommendation engine built with Spark
streamnative/pulsar-spark
Spark Connector to read and write with Pulsar
izhangzhihao/Real-time-Data-Warehouse
Real-time Data Warehouse with Apache Flink & Apache Kafka & Apache Hudi
minio/spark-select
A library for Spark DataFrame using MinIO Select API
Thomas-George-T/Movies-Analytics-in-Spark-and-Scala
Data cleaning, pre-processing, and Analytics on a million movies using Spark and Scala.
martandsingh/ApacheSpark
This repository will help you learn Databricks concepts through examples. It covers the important topics a data engineer needs in real-world work. We use PySpark and Spark SQL for development. At the end of the course we also cover a few case studies.
sjrusso8/spark-connect-rs
Apache Spark Connect Client for Rust
LearningJournal/SparkProgrammingInScala
Apache Spark Course Material
huangyueranbbc/SparkDemo
Complete Spark example code demo (Java, Scala)
streamnative/awesome-pulsar
A curated list of Pulsar tools, integrations and resources.
groda/big_data
Tutorials on Big Data essentials: Hadoop, MapReduce, Spark.
zsvoboda/ngods
New generation open source data stack
harryprince/geospark
Bring sf to Spark in production
kaantas/spark-twitter-sentiment-analysis
Sentiment Analysis of a Twitter Topic with Spark Structured Streaming