Flyte

Flyte is a production-grade, container-native, type-safe workflow and pipeline platform, written in Golang and optimized for large-scale data processing and machine learning.


💥 Introduction

Flyte is a structured programming and distributed processing platform that enables highly concurrent, scalable, and maintainable workflows for machine learning and data processing. It is a fabric that connects disparate computation backends using a type-safe data dependency graph. It records every change to a pipeline, so earlier versions can be revisited. It also stores a history of all executions and provides an intuitive UI, CLI, and REST/gRPC API for interacting with the computation.

Flyte is more than a workflow engine -- it treats the workflow as a core concept and the task, a single unit of execution, as a top-level concept. Multiple tasks arranged in data producer-consumer order form a workflow.

Workflows and tasks can be written in any language, with out-of-the-box support for Python, Java, and Scala.
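
The idea of tasks arranged in data producer-consumer order can be pictured with a small plain-Python sketch. This is not flytekit code and does not use any Flyte API; the task functions, the `WORKFLOW` dictionary, and `run_workflow` are hypothetical names invented for illustration. It only shows how a dependency graph lets each task run after the tasks that produce its inputs.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical tasks: plain functions standing in for units of execution.
def load() -> list[int]:
    return [3, 1, 2]

def clean(rows: list[int]) -> list[int]:
    return sorted(rows)

def summarize(rows: list[int]) -> int:
    return sum(rows)

# Each node names its task and the upstream nodes whose outputs it consumes.
WORKFLOW = {
    "load": (load, []),
    "clean": (clean, ["load"]),
    "summarize": (summarize, ["clean"]),
}

def run_workflow(nodes):
    """Run every node after the nodes that produce its inputs."""
    graph = {name: deps for name, (_, deps) in nodes.items()}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        func, deps = nodes[name]
        # Feed each producer's output to its consumer, in declared order.
        results[name] = func(*(results[d] for d in deps))
    return results

print(run_workflow(WORKFLOW)["summarize"])
```

In Flyte itself the graph is derived from the declared task interfaces; the explicit dependency lists here are a simplification.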

🚀 Quick Start

With Docker installed, run the following command:

  docker run --rm --privileged -p 30081:30081 -p 30084:30084 ghcr.io/flyteorg/flyte-sandbox

This creates a local Flyte sandbox. Once the sandbox is ready, you should see the following message: Flyte is ready! Flyte UI is available at http://localhost:30081/console.

Go ahead and visit http://localhost:30081/console to view the Flyte dashboard.
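
If you would rather script the wait than watch the container logs, a polling helper along these lines may be useful. It is a minimal sketch using only the Python standard library; `is_up` and `wait_for` are hypothetical helper names, not part of any Flyte tooling, and the assumption that the console URL answers with HTTP 200 once the sandbox is ready has not been verified here.

```python
import time
import urllib.error
import urllib.request

CONSOLE_URL = "http://localhost:30081/console"  # URL from the Quick Start

def is_up(url: str) -> bool:
    """Return True if the URL answers with HTTP 200 (assumed readiness signal)."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or HTTP error: not ready yet.
        return False

def wait_for(url: str, attempts: int = 30, delay: float = 2.0, probe=is_up) -> bool:
    """Call probe(url) up to `attempts` times, sleeping `delay` seconds after each failure."""
    for _ in range(attempts):
        if probe(url):
            return True
        time.sleep(delay)
    return False

# Usage: wait_for(CONSOLE_URL) returns True once the console responds.
```

The `probe` parameter exists so the retry loop can be exercised without a running sandbox.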


To dig deeper into Flyte, refer to the Documentation.

🔥 Features

Used at scale in production by 500+ users at Lyft, with more than 1 million executions and 40+ million container executions per month
Enables collaboration across your organization:
- Execute distributed data pipelines/workflows
- Reuse tasks across projects, users, and workflows
- Backtrace to a specified workflow
- Compare results of training workflows over time and across pipelines
- Share workflows and tasks across your teams
Quick registration -- start locally and scale to the cloud instantly
Centralized inventory of tasks, workflows, and executions
gRPC / REST interface to define and execute tasks and workflows
Type-safe construction of pipelines -- each task has an interface characterized by its inputs and outputs, so an illegally constructed pipeline fails at declaration rather than at runtime
Supports multiple data types for machine learning and data processing pipelines, such as Blobs (images, arbitrary files), Directories, Schema (columnar structured data), collections, maps, etc.
Memoization and Lineage tracking
Workflow features:
- Start with one task, convert it to a pipeline, attach multiple schedules, and trigger it through a programmatic API or on demand
- Parallel step execution
- Extensible backend for adding custom plugins (with a simplified user experience)
- Branching
- Inline subworkflows (a workflow can be embedded within one node of the top-level workflow)
- Distributed remote child workflows (a remote workflow can be triggered and statically verified at compile time)
- Array Tasks (map a function over a large dataset -- ensures controlled execution of thousands of containers)
- Dynamic workflow creation and execution with runtime type safety
- Container side plugins with first class support in Python
- Pre-alpha: support for arbitrary flytekit-less containers (RawContainer)
Guaranteed reproducibility of pipelines via:
- Versioned data, code and models
- Automatically tracked executions
- Declarative pipelines
Multi cloud support (AWS, GCP and others)
Extensible, modular core with deep observability
Automated notifications to Slack, Email, and PagerDuty
Multi K8s cluster support
Out-of-the-box support for running Spark jobs on K8s, Hive queries, etc.
Snappy Console
Python CLI and Golang CLI (flytectl)
Written in Golang and optimized for the performance of large running jobs
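
The type-safe construction feature listed above (an illegal pipeline fails at declaration rather than at runtime) can be pictured with a short plain-Python sketch. This is an illustration of the idea only, not Flyte's implementation or flytekit's API; `connect` and the three example functions are hypothetical names, and the check is a simple equality test on type annotations.

```python
from typing import get_type_hints

def connect(producer, consumer, param):
    """Check, before anything runs, that producer's output fits consumer's input."""
    out_type = get_type_hints(producer).get("return")
    in_type = get_type_hints(consumer).get(param)
    if out_type != in_type:
        raise TypeError(
            f"{producer.__name__} returns {out_type}, "
            f"but {consumer.__name__}.{param} expects {in_type}"
        )
    return (producer, consumer, param)

# Hypothetical tasks with declared interfaces.
def count_rows() -> int:
    return 42

def describe(n: int) -> str:
    return f"{n} rows"

def shout(text: str) -> str:
    return text.upper()

connect(count_rows, describe, "n")      # accepted: int feeds an int parameter
try:
    connect(count_rows, shout, "text")  # rejected: int cannot feed a str parameter
except TypeError as err:
    print(err)
```

Note that neither `count_rows` nor `shout` is ever called: the mismatch is caught while wiring the edge, which is the property the feature describes.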

In Progress

Grafana templates (user/system observability)
Helm chart for Flyte
Performance optimization
Flink-K8s

🔌 Available Plugins

Containers
K8s Pods
AWS Batch Arrays
K8s Pod Arrays
K8s Spark (native PySpark and Java/Scala)
AWS Athena
Qubole Hive
Presto Queries
Distributed PyTorch (K8s Native) -- PyTorch Operator
SageMaker (built-in algorithms and custom models)
Distributed TensorFlow (K8s Native) -- TFOperator
Papermill notebook execution (Python and Spark)
Type safety and data checking for Pandas DataFrames using Pandera

In Queue

Reactive pipelines
A lot more integrations!

📦 Component Repos

Repo	Language	Purpose	Status
flyte	Kustomize,RST	deployment, documentation, issues	Production-grade
flyteidl	Protobuf	interface definitions	Production-grade
flytepropeller	Go	execution engine	Production-grade
flyteadmin	Go	control plane	Production-grade
flytekit	Python	Python SDK and tools	Production-grade
flyteconsole	TypeScript	admin console	Production-grade
datacatalog	Go	manage input & output artifacts	Production-grade
flyteplugins	Go	Flyte plugins	Production-grade
flytestdlib	Go	standard library	Production-grade
flytesnacks	Python	examples, tips, and tricks	Incubating
flytekit-java	Java/Scala	Java & Scala SDK for authoring Flyte workflows	Incubating
flytectl	Go	A standalone Flyte CLI	Incomplete

🔩 Production K8s Operators

Repo	Language	Purpose
Spark	Go	Apache Spark batch
Flink	Go	Apache Flink streaming

🤝 Community & Resources

These resources will help you get a better understanding of Flyte.

Communication Channels

Biweekly Community Sync

📣 The Flyte OSS Community Sync takes place every other Tuesday, 9am-10am PDT (check out the events calendar and subscribe). Here's the Zoom link.
Meeting notes and the backlog of topics are captured in a doc.
To revisit a past community sync meeting, see the video recordings.

Conference Talks

KubeCon 2019 - Flyte: Cloud Native Machine Learning and Data Processing Platform video | deck
KubeCon 2019 - Running Large Scale Stateful Workloads on Kubernetes at Lyft video
re:Invent 2019 - Implementing ML workflows with Kubernetes and Amazon SageMaker video
Cloud-native machine learning at Lyft with AWS Batch and Amazon EKS video
OSS + ELC NA 2020 splash
Datacouncil splash
FB AI@Scale Making MLOps & DataOps a reality
GAIC 2020

Blog Posts

Podcasts

TWIML&AI - Scalable and Maintainable ML Workflows at Lyft - Flyte
Software Engineering Daily - Flyte: Lyft Data Processing Platform
MLOps Coffee session - Flyte: an open-source tool for scalable, extensible, and portable workflows

💖 Top Contributors

A big thank you to the community for making Flyte possible!
