
Accelerate your ML and data workflows to production. Flyte is a production-grade orchestration system for your data and ML workloads. It has been battle-tested at Lyft, Spotify, Freenome, and others, and it is truly open source.

Primary language: Python. License: Apache License 2.0 (Apache-2.0).


Flyte

Flyte is a production-grade, container-native, type-safe workflow and pipelines platform optimized for large-scale processing and machine learning, written in Golang.


💥 Introduction

Flyte is a structured programming and distributed processing platform that enables highly concurrent, scalable, and maintainable workflows for machine learning and data processing. It is a fabric that connects disparate computation backends using a type-safe data dependency graph. It records all changes to a pipeline, making it possible to rewind time. It also stores a history of all executions and provides an intuitive UI, CLI, and REST/gRPC API to interact with the computation.

Flyte is more than a workflow engine -- it treats the workflow as a core concept and the task, a single unit of execution, as a top-level concept. Multiple tasks arranged in a data producer-consumer order create a workflow.
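The relationship between tasks and workflows can be illustrated with a minimal sketch in plain Python. This is not Flyte's SDK: `Task`, `run_workflow`, and the sample task names are hypothetical, and the sketch only shows how arranging tasks in producer-consumer order yields a workflow that can be executed in dependency order.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class Task:
    """A single unit of execution (hypothetical type, not Flyte's API)."""
    name: str
    fn: Callable[..., object]
    # Names of the upstream tasks whose outputs this task consumes, in order.
    inputs: Tuple[str, ...] = ()


def run_workflow(tasks: Dict[str, Task]) -> Dict[str, object]:
    """Run every task after all of the tasks that produce its inputs."""
    # Map each task to its set of predecessors (its producers).
    graph = {name: set(task.inputs) for name, task in tasks.items()}
    results: Dict[str, object] = {}
    for name in TopologicalSorter(graph).static_order():
        task = tasks[name]
        results[name] = task.fn(*(results[dep] for dep in task.inputs))
    return results


# Three tasks wired as producer -> consumer -> consumer.
workflow = {
    "load": Task("load", lambda: [3, 1, 2]),
    "sort": Task("sort", sorted, inputs=("load",)),
    "top": Task("top", lambda xs: xs[-1], inputs=("sort",)),
}
results = run_workflow(workflow)
print(results["top"])
```

The dictionary order of `workflow` does not matter here; the execution order is derived from the declared data dependencies alone.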

Workflows and tasks can be written in any language, with out-of-the-box support for Python, Java, and Scala.

🚀 Quick Start

With Docker installed, run the following command:

  docker run --rm --privileged -p 30081:30081 -p 30084:30084 ghcr.io/flyteorg/flyte-sandbox

This creates a local Flyte sandbox. Once the sandbox is ready, you should see the following message: Flyte is ready! Flyte UI is available at http://localhost:30081/console.

Go ahead and visit http://localhost:30081/console to view the Flyte dashboard.
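If you would rather script the wait than watch the container logs, the readiness check can be sketched as a small polling loop. This is an illustrative helper, not part of Flyte: `http_ok` and `wait_until_ready` are hypothetical names, and the only value taken from the Quick Start is the console URL.

```python
import http.client
import time
import urllib.error
import urllib.request
from typing import Callable

CONSOLE_URL = "http://localhost:30081/console"  # URL from the Quick Start


def http_ok(url: str) -> bool:
    """Return True if the URL answers with an HTTP 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, or an HTTP error status: not ready yet.
        return False


def wait_until_ready(
    url: str,
    probe: Callable[[str], bool] = http_ok,
    attempts: int = 30,
    delay_s: float = 2.0,
) -> bool:
    """Call probe(url) up to `attempts` times, sleeping between failed tries."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay_s)
    return False


if __name__ == "__main__":
    ready = wait_until_ready(CONSOLE_URL)
    print("Flyte console reachable" if ready else "Flyte console not reachable yet")
```

The `probe` parameter exists only so the loop can be exercised without a running sandbox; the attempt count and delay are arbitrary defaults, not values recommended by Flyte.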

[Figure: Flyte console example, a quick visual tour of the console]

To dig deeper into Flyte, refer to the Documentation.

⭐️ Current Deployments

🔥 Features

  • Used at scale in production by 500+ users at Lyft, with more than 1 million executions and 40+ million container executions per month
  • Enables collaboration across your organization. You can:
    • Execute distributed data pipelines/workflows
    • Reuse tasks across projects, users, and workflows
    • Backtrace to a specified workflow
    • Compare results of training workflows over time and across pipelines
    • Share workflows and tasks across your teams
  • Quick registration -- start locally and scale to the cloud instantly
  • Centralized inventory of tasks, workflows, and executions
  • gRPC / REST interface to define and execute tasks and workflows
  • Type-safe construction of pipelines -- each task has an interface characterized by its inputs and outputs; thus, illegal construction of pipelines fails during declaration rather than at runtime
  • Supports multiple data types for machine learning and data processing pipelines, such as Blobs (images, arbitrary files), Directories, Schema (columnar structured data), collections, maps, etc.
  • Memoization and Lineage tracking
  • Workflow features:
    • Start with one task, convert to a pipeline, attach multiple schedules, trigger using a programmatic API, or on-demand
    • Parallel step execution
    • Extensible backend for adding custom plugins (with a simplified user experience)
    • Branching
    • Inline subworkflows (a workflow can be embedded within one node of the top-level workflow)
    • Distributed remote child workflows (a remote workflow can be triggered and statically verified at compile time)
    • Array Tasks (map a function over a large dataset -- ensures controlled execution of thousands of containers)
    • Dynamic workflow creation and execution with runtime type safety
    • Container side plugins with first class support in Python
    • Pre-alpha: arbitrary flytekit-less containers supported (RawContainer)
  • Guaranteed reproducibility of pipelines via:
    • Versioned data, code and models
    • Automatically tracked executions
    • Declarative pipelines
  • Multi-cloud support (AWS, GCP and others)
  • Extensible, modular core with deep observability
  • Automated notifications to Slack, Email, and PagerDuty
  • Multi K8s cluster support
  • Out-of-the-box support for running Spark jobs on K8s, Hive queries, etc.
  • Snappy Console
  • Python CLI and Golang CLI (flytectl)
  • Written in Golang and optimized for the performance of large running jobs
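The type-safe construction feature listed above (illegal pipelines fail during declaration rather than at runtime) can be illustrated with a small sketch in plain Python. This is not how Flyte implements its type system: `check_edge` and the three sample functions are hypothetical, and the check is a simple exact match on type annotations.

```python
from typing import Callable, Optional, get_type_hints


def check_edge(producer: Callable, consumer: Callable, param: str) -> None:
    """Raise TypeError if producer's output type differs from consumer's input type.

    Only annotations are inspected; neither function is called, so a bad
    connection is rejected while the pipeline is still being declared.
    """
    out_type = get_type_hints(producer).get("return")
    in_type = get_type_hints(consumer).get(param)
    if out_type is None or in_type is None:
        raise TypeError(
            f"{producer.__name__} -> {consumer.__name__}.{param}: missing type annotation"
        )
    if out_type != in_type:
        raise TypeError(
            f"{producer.__name__} returns {out_type!r}, "
            f"but {consumer.__name__}.{param} expects {in_type!r}"
        )


# Hypothetical tasks, each with a typed interface.
def parse_count(raw: str) -> int:
    return int(raw)


def describe(count: int) -> str:
    return f"{count} rows"


def shout(text: str) -> str:
    return text.upper()


check_edge(parse_count, describe, "count")  # int -> int: accepted

error_message: Optional[str] = None
try:
    check_edge(parse_count, shout, "text")  # int -> str: rejected before any task runs
except TypeError as err:
    error_message = str(err)
print(error_message)
```

A real system would also need to handle generic and structured types (collections, schemas, blobs); the exact-match comparison here is the simplest rule that demonstrates failing at declaration time.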

In Progress

  • Grafana templates (user/system observability)
  • Helm chart for Flyte
  • Performance optimization
  • Flink-K8s

🔌 Available Plugins

In Queue

  • Reactive pipelines
  • A lot more integrations!

📦 Component Repos

Repo          | Language       | Purpose                                        | Status
flyte         | Kustomize, RST | deployment, documentation, issues              | Production-grade
flyteidl      | Protobuf       | interface definitions                          | Production-grade
flytepropeller| Go             | execution engine                               | Production-grade
flyteadmin    | Go             | control plane                                  | Production-grade
flytekit      | Python         | Python SDK and tools                           | Production-grade
flyteconsole  | Typescript     | admin console                                  | Production-grade
datacatalog   | Go             | manage input & output artifacts                | Production-grade
flyteplugins  | Go             | flyte plugins                                  | Production-grade
flytestdlib   | Go             | standard library                               | Production-grade
flytesnacks   | Python         | examples, tips, and tricks                     | Incubating
flytekit-java | Java/Scala     | Java & Scala SDK for authoring Flyte workflows | Incubating
flytectl      | Go             | a standalone Flyte CLI                         | Incomplete

🔩 Production K8s Operators

Repo  | Language | Purpose
Spark | Go       | Apache Spark batch
Flink | Go       | Apache Flink streaming

🤝 Community & Resources

Here are some resources to help you get a better understanding of Flyte.

Communication Channels

Biweekly Community Sync

Conference Talks

  • KubeCon 2019 - Flyte: Cloud Native Machine Learning and Data Processing Platform (video | deck)
  • KubeCon 2019 - Running Large Scale Stateful workloads on Kubernetes at Lyft (video)
  • re:Invent 2019 - Implementing ML workflows with Kubernetes and Amazon Sagemaker (video)
  • Cloud-native machine learning at Lyft with AWS Batch and Amazon EKS (video)
  • OSS + ELC NA 2020 (splash)
  • Datacouncil (splash)
  • FB AI@Scale Making MLOps & DataOps a reality
  • GAIC 2020

Blog Posts

  1. Introducing Flyte: A Cloud Native Machine Learning and Data Processing Platform
  2. Building a Gateway to Flyte

Podcasts

💖 Top Contributors

A big thank you to the community for making Flyte possible!