

Primary language: Java. License: Apache License 2.0 (Apache-2.0).

Kestra workflow orchestrator

Infinitely scalable open source orchestration & scheduling platform.


Website | Twitter | LinkedIn | Slack | Documentation



Demo

Play with our demo app!

What is Kestra?

Kestra is an infinitely scalable orchestration and scheduling platform for creating, running, scheduling, and monitoring millions of complex pipelines.

  • 🔀 Any kind of workflow: Workflows can start simple and grow into more complex systems with branching, parallel and dynamic tasks, and flow dependencies.
  • 🎓 Easy to learn: Flows are defined in YAML, a simple, descriptive language, so you don't need to be a developer to create a new flow.
  • 🔣 Easy to extend: Plugins are everywhere in Kestra. Many are available from the Kestra core team, and you can easily create your own.
  • 🆙 Any trigger: Kestra is event-based at heart, so you can trigger an execution from the API, a schedule, detection, or events.
  • 💻 A rich user interface: The built-in web interface lets you create, run, and monitor all your flows. There is no need to deploy your flows, just edit them.
  • Enjoy infinite scalability: Kestra is built around top cloud-native technologies, so you can scale to millions of executions stress-free.
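
As a rough illustration of the trigger and parallel-task features listed above, here is a minimal sketch of a flow that runs two branches in parallel on a schedule. The flow id, task ids, messages, and cron expression are hypothetical, and the `type` names are assumptions about Kestra's core tasks and triggers that may differ between versions, so check the documentation for your release before relying on them.

```yaml
# Illustrative sketch only, not taken from the Kestra documentation.
# Ids and the cron expression are made up; the type names below are
# assumed core task/trigger names and may differ in your Kestra version.
id: scheduled-parallel-sketch
namespace: my.company.teams

tasks:
  # Run the two child tasks concurrently instead of one after the other.
  - id: fanOut
    type: io.kestra.core.tasks.flows.Parallel
    tasks:
      - id: branchA
        type: io.kestra.core.tasks.debugs.Echo
        format: "Branch A of execution {{ execution.id }}"
      - id: branchB
        type: io.kestra.core.tasks.debugs.Echo
        format: "Branch B of execution {{ execution.id }}"

triggers:
  # Start one execution every day at 06:00 (standard 5-field cron syntax).
  - id: everyMorning
    type: io.kestra.core.models.triggers.types.Schedule
    cron: "0 6 * * *"
```

The same flow could also be started manually from the UI or through the API; the trigger only adds the scheduled start.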

Example flow:

id: my-first-flow
namespace: my.company.teams

inputs:
  - type: FILE
    name: uploaded
    description: A CSV file to be uploaded through the API or UI

tasks:
  - id: archive
    type: io.kestra.plugin.gcp.gcs.Upload
    description: Archive the file on Google Cloud Storage bucket
    from: "{{ inputs.uploaded }}"
    to: "gs://my_bucket/archives/{{ execution.id }}.csv"

  - id: csvReader
    type: io.kestra.plugin.serdes.csv.CsvReader
    from: "{{ inputs.uploaded }}"

  - id: fileTransform
    type: io.kestra.plugin.scripts.nashorn.FileTransform
    description: This task anonymizes contactName with a custom Nashorn script (JavaScript on the JVM). It shows that you can handle custom transformations or remapping, ETL style.
    from: "{{ outputs.csvReader.uri }}"
    script: |
      if (row['contactName']) {
        row['contactName'] = "*".repeat(row['contactName'].length);
      }

  - id: avroWriter
    type: io.kestra.plugin.serdes.avro.AvroWriter
    description: This task converts the file from Kestra internal storage to Avro. Again, this is ETL, since the conversion is done by Kestra before the data is loaded into BigQuery. This gives you some control before loading and lets you reject bad data as early as possible.
    from: "{{ outputs.fileTransform.uri }}"
    schema: |
      {
        "type": "record",
        "name": "Root",
        "fields":
          [
            { "name": "contactTitle", "type": ["null", "string"] },
            { "name": "postalCode", "type": ["null", "long"] },
            { "name": "entityId", "type": ["null", "long"] },
            { "name": "country", "type": ["null", "string"] },
            { "name": "region", "type": ["null", "string"] },
            { "name": "address", "type": ["null", "string"] },
            { "name": "fax", "type": ["null", "string"] },
            { "name": "email", "type": ["null", "string"] },
            { "name": "mobile", "type": ["null", "string"] },
            { "name": "companyName", "type": ["null", "string"] },
            { "name": "contactName", "type": ["null", "string"] },
            { "name": "phone", "type": ["null", "string"] },
            { "name": "city", "type": ["null", "string"] }
          ]
      }

  - id: load
    type: io.kestra.plugin.gcp.bigquery.Load
    description: Load the file generated by the avroWriter task into BigQuery
    avroOptions:
      useAvroLogicalTypes: true
    destinationTable: kestra-prd.demo.customer_copy
    format: AVRO
    from: "{{ outputs.avroWriter.uri }}"
    writeDisposition: WRITE_TRUNCATE

  - id: aggregate
    type: io.kestra.plugin.gcp.bigquery.Query
    description: Aggregate some data from loaded files
    createDisposition: CREATE_IF_NEEDED
    destinationTable: kestra-prd.demo.agg
    sql: |
      SELECT k.categoryName, p.productName, c.companyName, s.orderDate, SUM(d.quantity) AS quantity, SUM(d.unitPrice * d.quantity * r.exchange) as totalEur
      FROM `kestra-prd.demo.salesOrder` AS s
      INNER JOIN `kestra-prd.demo.orderDetail` AS d ON s.entityId = d.orderId
      INNER JOIN `kestra-prd.demo.customer` AS c ON c.entityId = s.customerId
      INNER JOIN `kestra-prd.demo.product` AS p ON p.entityId = d.productId
      INNER JOIN `kestra-prd.demo.category` AS k ON k.entityId = p.categoryId
      INNER JOIN `kestra-prd.demo.rates` AS r ON r.date = DATE(s.orderDate) AND r.currency = "USD"
      GROUP BY 1, 2, 3, 4
    timePartitioningField: orderDate
    writeDisposition: WRITE_TRUNCATE

Getting Started

To get a local copy up and running, please follow these steps.

Prerequisites

Make sure you have already installed:

Launch Kestra

Plugins

Kestra is built on a plugin system. You can look for an existing plugin that interacts with your provider, or follow these steps to develop your own. These are the official plugins currently available:

Airbyte | Amazon S3 | Avro
Azure Blob Storage | Bash | BigQuery
CSV | Cassandra | ClickHouse
dbt | Debezium MySQL | Debezium Postgres
Debezium Microsoft SQL Server | DuckDB | Elasticsearch
Fivetran | Email | FTP
FTPS | Google Cloud Storage | Google Drive
Google Sheets | Groovy | HTTP
JSON | Jython | Kafka
Kubernetes | MQTT | Microsoft SQL Server
MongoDB | MySQL | Nashorn
Node | OpenPGP | Oracle
Parquet | Apache Pinot | Postgres
Power BI | Apache Pulsar | Python
Redshift | Rockset | SFTP
ServiceNow | Singer | Slack
Snowflake | Soda | Spark
Tika | Trino | Vectorwise
XML | Vertex AI | Vertica

This list is growing quickly as we are actively building more plugins, and we welcome contributions!
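
A plugin is used in a flow by setting a task's `type` to the fully qualified name of one of the plugin's tasks. As a minimal sketch, the flow below uses a single task from the CSV plugin, reusing the `CsvReader` type and the `FILE` input from the example flow above. The flow id is hypothetical, and the plugin is assumed to be installed on your Kestra instance.

```yaml
# Minimal sketch: one flow, one plugin task.
# The flow id is made up; the input and task are copied from the example flow.
id: csv-plugin-sketch
namespace: my.company.teams

inputs:
  - type: FILE
    name: uploaded

tasks:
  # The type identifies the plugin task to run (here, the CSV plugin's reader).
  - id: csvReader
    type: io.kestra.plugin.serdes.csv.CsvReader
    from: "{{ inputs.uploaded }}"
```

Later tasks can then read this task's result through `{{ outputs.csvReader.uri }}`, as the example flow does.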

Community Support

Join our community if you need help, want to chat, or have any other questions for us:

  • GitHub - Discussion forums and updates from the Kestra team
  • Twitter - For all the latest Kestra news
  • Slack - Join the conversation! Get all the latest updates and chat with the devs

Roadmap

See the open issues for a list of proposed features (and known issues) or look at the project board.

Developing locally & Contributing

We love contributions, big or small. Check out our guide on how to get started.

See our Plugin Developer Guide for developing Kestra plugins.

License

Apache 2.0 © Kestra Technologies