Flyte is a production-grade, container-native, type-safe workflow and pipelines platform, written in Golang and optimized for large-scale processing and machine learning.
Flyte is a structured programming and distributed processing platform that enables highly concurrent, scalable and maintainable workflows for Machine Learning and Data Processing. It is a fabric that connects disparate computation backends using a type safe data dependency graph. It records all changes to a pipeline, making it possible to rewind time. It also stores a history of all executions and provides an intuitive UI, CLI and REST/gRPC API to interact with the computation.
Flyte is more than a workflow engine -- it provides workflow as a core concept and a single unit of execution called task as a top level concept. Multiple tasks arranged in a data producer-consumer order create a workflow.
Workflows and Tasks can be written in any language, with out of the box support for Python, Java and Scala.
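The producer-consumer idea can be sketched in plain Python. This is not flytekit's API: `check_chain`, `make_workflow`, and the three example tasks are hypothetical names invented for illustration, and the sketch assumes each task takes exactly one annotated input. It only shows how a typed task interface lets a mismatched pipeline fail when it is declared rather than when it runs.

```python
import inspect
from typing import Callable, get_type_hints


def check_chain(*tasks: Callable) -> None:
    """Raise TypeError if a task's output type differs from the next task's input type."""
    for producer, consumer in zip(tasks, tasks[1:]):
        produced = get_type_hints(producer).get("return")
        # Assumption: every task has exactly one input parameter.
        first_param = next(iter(inspect.signature(consumer).parameters))
        expected = get_type_hints(consumer).get(first_param)
        if produced != expected:
            raise TypeError(
                f"{producer.__name__} produces {produced}, "
                f"but {consumer.__name__} expects {expected}"
            )


def make_workflow(*tasks: Callable) -> Callable:
    """Compose tasks in producer-consumer order after checking their interfaces."""
    check_chain(*tasks)  # declaration-time check, before any data flows

    def run(value):
        for task in tasks:
            value = task(value)
        return value

    return run


# Hypothetical example tasks with typed interfaces.
def count_words(text: str) -> int:
    return len(text.split())


def double(n: int) -> int:
    return 2 * n


def shout(text: str) -> str:
    return text.upper()


word_count_doubled = make_workflow(count_words, double)
print(word_count_doubled("type safe pipelines"))  # prints 6

try:
    make_workflow(count_words, shout)  # int output fed to a str input
except TypeError as err:
    print(f"rejected at declaration: {err}")
```

The second `make_workflow` call is rejected before any task runs, because `count_words` returns `int` while `shout` expects `str`.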
With Docker installed, run the following command:
docker run --rm --privileged -p 30081:30081 -p 30084:30084 ghcr.io/flyteorg/flyte-sandbox
This creates a local Flyte sandbox. Once the sandbox is ready, you should see the following message: Flyte is ready! Flyte UI is available at http://localhost:30081/console.
Go ahead and visit http://localhost:30081/console to view the Flyte dashboard.
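If you would rather script the wait than watch the container logs, a small poller along these lines may help. It is a sketch using only the Python standard library, not part of Flyte: `console_is_up` and `wait_until_ready` are hypothetical helper names, and treating any successful HTTP response from the console URL as "ready" is an assumption.

```python
import http.client
import time
import urllib.request
from typing import Callable

CONSOLE_URL = "http://localhost:30081/console"  # URL printed by the sandbox


def console_is_up(url: str = CONSOLE_URL, timeout_s: float = 2.0) -> bool:
    """Return True if an HTTP GET to the console URL succeeds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return 200 <= resp.status < 400
    except (OSError, http.client.HTTPException):
        # Connection refused, timeouts, and HTTP errors all mean "not ready yet".
        return False


def wait_until_ready(
    probe: Callable[[], bool], attempts: int = 30, delay_s: float = 2.0
) -> bool:
    """Call probe up to `attempts` times, sleeping between tries; True on first success."""
    for _ in range(attempts):
        if probe():
            return True
        time.sleep(delay_s)
    return False
```

Calling `wait_until_ready(console_is_up)` after starting the container returns `True` once the console answers, or `False` after roughly a minute with the default settings.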
Here's a quick visual tour of the console.
To dig deeper into Flyte, refer to the Documentation.
- Used at scale in production by 500+ users at Lyft, with more than 1 million executions and 40+ million container executions per month
- Enables collaboration across your organization, as in:
  - Execute distributed data pipelines/workflows
  - Reuse tasks across projects, users, and workflows
  - Backtrace to a specified workflow
  - Compare results of training workflows over time and across pipelines
  - Share workflows and tasks across your teams
- Quick registration -- start locally and scale to the cloud instantly
- Centralized Inventory constituting Tasks, Workflows and Executions
- gRPC / REST interface to define and execute tasks and workflows
- Type safe construction of pipelines -- each task has an interface which is characterized by its input and output; thus, illegal construction of pipelines fails during declaration rather than at runtime
- Supports multiple data types for machine learning and data processing pipelines, such as Blobs (images, arbitrary files), Directories, Schema (columnar structured data), collections, maps etc.
- Memoization and Lineage tracking
- Workflow features:
  - Start with one task, convert to a pipeline, attach multiple schedules, trigger using a programmatic API, or on-demand
  - Parallel step execution
  - Extensible backend to add customized plugin experience (with simplified user experience)
  - Branching
  - Inline subworkflows (a workflow can be embedded within one node of the top level workflow)
  - Distributed remote child workflows (a remote workflow can be triggered and statically verified at compile time)
  - Array Tasks (map a function over a large dataset -- ensures controlled execution of thousands of containers)
  - Dynamic workflow creation and execution with runtime type safety
  - Container side plugins with first class support in Python
  - PreAlpha: Arbitrary flytekit-less containers supported (RawContainer)
- Guaranteed reproducibility of pipelines via:
  - Versioned data, code and models
  - Automatically tracked executions
  - Declarative pipelines
- Multi cloud support (AWS, GCP and others)
- Extensible core, modularized, and deep observability
- Automated notifications to Slack, Email, and PagerDuty
- Multi K8s cluster support
- Out of the box support to run Spark jobs on K8s, Hive queries, etc.
- Snappy Console
- Python CLI and Golang CLI (flytectl)
- Written in Golang and optimized for the performance of large running jobs
- Grafana templates (user/system observability)
- Helm chart for Flyte
- Performance optimization
- Flink-K8s
- Containers
- K8s Pods
- AWS Batch Arrays
- K8s Pod Arrays
- K8s Spark (native PySpark and Java/Scala)
- AWS Athena
- Qubole Hive
- Presto Queries
- Distributed PyTorch (K8s Native) -- PyTorch Operator
- SageMaker (built-in algorithms & custom models)
- Distributed TensorFlow (K8s Native) -- TFOperator
- Papermill notebook execution (Python and Spark)
- Type-safe data checking for pandas DataFrames using Pandera
- Reactive pipelines
- A lot more integrations!
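The "Memoization and Lineage tracking" item in the feature list can be pictured with a small cache keyed by task name, a version string, and the inputs. This is a conceptual sketch in plain Python, not Flyte's implementation: `cache_key`, `run_memoized`, and the in-memory `_cache` dictionary are hypothetical, and the sketch assumes inputs are JSON-serializable.

```python
import hashlib
import json
from typing import Any, Callable, Dict

# Hypothetical in-memory store; a real system would persist results elsewhere.
_cache: Dict[str, Any] = {}


def cache_key(task_name: str, version: str, inputs: Dict[str, Any]) -> str:
    """Derive a stable key from the task name, its version, and its inputs."""
    payload = json.dumps(
        {"task": task_name, "version": version, "inputs": inputs},
        sort_keys=True,  # same inputs in any order give the same key
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_memoized(task: Callable[..., Any], version: str, **inputs: Any) -> Any:
    """Run the task only if no result is stored for this (task, version, inputs)."""
    key = cache_key(task.__name__, version, inputs)
    if key not in _cache:
        _cache[key] = task(**inputs)
    return _cache[key]
```

Repeating a call with the same task, version, and inputs returns the stored result without re-running the task, while changing the version string forces a fresh execution.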
Repo | Language | Purpose | Status |
---|---|---|---|
flyte | Kustomize, RST | deployment, documentation, issues | Production-grade |
flyteidl | Protobuf | interface definitions | Production-grade |
flytepropeller | Go | execution engine | Production-grade |
flyteadmin | Go | control plane | Production-grade |
flytekit | Python | Python SDK and tools | Production-grade |
flyteconsole | TypeScript | admin console | Production-grade |
datacatalog | Go | manage input & output artifacts | Production-grade |
flyteplugins | Go | flyte plugins | Production-grade |
flytestdlib | Go | standard library | Production-grade |
flytesnacks | Python | examples, tips, and tricks | Incubating |
flytekit-java | Java/Scala | Java & Scala SDK for authoring Flyte workflows | Incubating |
flytectl | Go | A standalone Flyte CLI | Incomplete |
Repo | Language | Purpose |
---|---|---|
Spark | Go | Apache Spark batch |
Flink | Go | Apache Flink streaming |
Here are the resources that would help you get a better understanding of Flyte.
- 📣 Flyte OSS Community Sync happens every alternate Tuesday, 9am-10am PDT (check out the events calendar & subscribe). Here's the Zoom link.
- Meeting notes and backlog of topics are captured in doc.
- If you'd like to revisit any community sync meeting that has happened, you can access the video recordings.
- KubeCon 2019 - Flyte: Cloud Native Machine Learning and Data Processing Platform video | deck
- KubeCon 2019 - Running Large-Scale Stateful workloads on Kubernetes at Lyft video
- re:Invent 2019 - Implementing ML workflows with Kubernetes and Amazon SageMaker video
- Cloud-native machine learning at Lyft with AWS Batch and Amazon EKS video
- OSS + ELC NA 2020 splash
- Datacouncil splash
- FB AI@Scale Making MLOps & DataOps a reality
- GAIC 2020
- Introducing Flyte: A Cloud Native Machine Learning and Data Processing Platform
- Building a Gateway to Flyte
- TWIML&AI - Scalable and Maintainable ML Workflows at Lyft - Flyte
- Software Engineering Daily - Flyte: Lyft Data Processing Platform
- MLOps Coffee session - Flyte: an open-source tool for scalable, extensible, and portable workflows
A big thank you to the community for making Flyte possible!
- @wild-endeavor
- @katrogan
- @EngHabu
- @akhurana001
- @anandswaminathan
- @kanterov
- @honnix
- @jeevb
- @jonathanburns
- @migueltol22
- @varshaparthay
- @pingsutw
- @narape
- @lu4nm3
- @bnsblue
- @RubenBarragan
- @schottra
- @evalsocket
- @matthewphsmith
- @slai
- @derwiki
- @tnsetting
- @jbrambleDC
- @igorvalko
- @chanadian
- @surindersinghp
- @vsbus
- @catalinii
- @kumare3