StarTree Examples Repository: Exploring Apache Pinot

Welcome to the StarTree Examples Repository for all things Apache Pinot! This repository collects practical examples and use cases that showcase the capabilities of Apache Pinot, an open-source distributed storage and query engine built for real-time analytics.

What is Apache Pinot?

Apache Pinot is a distributed data analytics platform designed for large-scale, real-time data ingestion and querying. It is optimized for use cases where low-latency, high-throughput querying of time-series data is crucial. Typical applications include monitoring, logging, and recommendation engines.

Repository Contents

In this repository, you'll find examples that demonstrate how to use Apache Pinot effectively. They cover a wide range of topics, including:

Data Ingestion: Learn how to efficiently ingest data into Apache Pinot from different sources.
Schema Design: Explore best practices for designing schemas that optimize query performance.
Real-time Queries: Dive into real-time query examples, showcasing Pinot's speed and responsiveness.
Batch Processing: Discover how to run batch jobs that load or update data in Pinot.
Integration with Big Data Ecosystem: See how Pinot can be integrated with other tools in the big data ecosystem.
Scaling and High Availability: Explore strategies for scaling and ensuring high availability of Pinot clusters.
Monitoring and Management: Learn about monitoring, managing, and maintaining Apache Pinot deployments.
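
To give a flavor of the query-related topics above, here is a minimal sketch of sending SQL to a Pinot broker over HTTP and reading the result. It relies on the broker's `/query/sql` endpoint and the `resultTable` layout of its JSON response. The broker address, the `events` table, and the sample response below are hypothetical placeholders for illustration, not taken from any example in this repository.

```python
import json
from urllib import request


def build_query_request(broker_url: str, sql: str) -> request.Request:
    """Build a POST request for the broker's SQL endpoint."""
    body = json.dumps({"sql": sql}).encode("utf-8")
    return request.Request(
        url=f"{broker_url.rstrip('/')}/query/sql",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def rows_as_dicts(response: dict) -> list:
    """Turn the broker's resultTable into a list of {column: value} dicts."""
    table = response.get("resultTable") or {}
    columns = table.get("dataSchema", {}).get("columnNames", [])
    return [dict(zip(columns, row)) for row in table.get("rows", [])]


def run_query(broker_url: str, sql: str) -> list:
    """Send the query to a running broker (needs network access)."""
    with request.urlopen(build_query_request(broker_url, sql)) as resp:
        return rows_as_dicts(json.load(resp))


# Hypothetical canned response, shaped like a broker reply, so the
# parsing step can be tried without a running cluster.
sample_response = {
    "resultTable": {
        "dataSchema": {
            "columnNames": ["country", "views"],
            "columnDataTypes": ["STRING", "LONG"],
        },
        "rows": [["US", 120], ["DE", 45]],
    }
}

for row in rows_as_dicts(sample_response):
    print(row)
```

Against a live cluster, something like `run_query("http://localhost:8099", "SELECT country, COUNT(*) AS views FROM events GROUP BY country")` would return the same list-of-dicts shape, assuming a broker is listening at that address and a table with those columns exists.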

How to Use This Repository

Each example in this repository comes with detailed documentation, code snippets, and step-by-step guides to help you follow along. You can clone this repository to your local environment and explore the examples at your own pace. Whether you're new to Apache Pinot or looking to expand your knowledge, these examples will provide valuable insights and hands-on experience.

Getting Started

To get started, simply navigate to the example of your choice and follow the instructions provided in the README files. Whether you're a data engineer, data scientist, or software developer, these examples will help you harness the power of Apache Pinot for your real-time data analytics needs.

Start exploring the examples and discover the capabilities of Apache Pinot today!

If you have any questions, feedback, or suggestions, feel free to open an issue in the repository. Happy exploring!

Repositories

Example	Description
Flink + Pinot	Learn how to integrate Apache Flink and Apache Pinot to build a real-time streaming pipeline with real-time analytics. We set up Postgres with sample data and enable change data capture (CDC) on it using Ververica's Postgres CDC connector.
Perf Testing Apache Pinot with Gatling	This example walks you through performance testing Apache Pinot with Gatling, a widely used load testing tool. It covers the key concepts, methodologies, and best practices for assessing the performance and scalability of your Pinot cluster.
Wikipedia	Capturing Wikipedia page changes in Pinot. This example uses a real-time UPSERT table and ingestion transformations in Pinot. You can quickly visualize the data in a Jupyter Notebook.
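
For readers unfamiliar with the upsert and ingestion-transformation features mentioned in the Wikipedia example, the sketch below builds a fragment of a real-time table config that enables both. It is not the configuration used by that example: the table name, column names, and transform expression are hypothetical, and the stream settings (for instance the Kafka topic) are omitted. Upsert tables also need `primaryKeyColumns` declared in the schema, which is not shown here.

```python
import json


def upsert_table_config(table_name: str, time_column: str) -> dict:
    """Return a partial REALTIME table config with upsert and one transform."""
    return {
        "tableName": table_name,
        "tableType": "REALTIME",
        "segmentsConfig": {
            "timeColumnName": time_column,
            "replication": "1",
        },
        # Upsert tables route queries with the strict replica group selector.
        "routing": {"instanceSelectorType": "strictReplicaGroup"},
        # FULL mode replaces the whole row when a primary key repeats.
        "upsertConfig": {"mode": "FULL"},
        "ingestionConfig": {
            "transformConfigs": [
                {
                    # Hypothetical: derive a `domain` column from a JSON
                    # string field named `meta` at ingestion time.
                    "columnName": "domain",
                    "transformFunction": "jsonPathString(meta, '$.domain')",
                }
            ]
        },
    }


config = upsert_table_config("wikievents", "ts")
print(json.dumps(config, indent=2))
```

Building the config as a Python dictionary keeps the sketch easy to tweak and serialize. A real deployment would add the stream configuration before submitting the JSON to the Pinot controller.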

Pinot Recipes

Additional Pinot Recipes can be found here