Tauffer-Consulting/domino

New feature suggestions

Closed this issue · 7 comments

This is an impressive project. During my usage, I've been looking to encapsulate a workflow into a single piece. This would allow others to directly utilize this piece to construct more advanced workflows. Is this conceptually possible? Would you contemplate adding such a feature?

Thanks for the suggestion, @lanzhixi!
Indeed, this would be an awesome feature, and we're thinking about ways to incorporate it in the future. Conceptually, it is possible.

Thank you for your affirmation. Do you have any ideas on how to implement this feature? I'd like to give it a try, although I can't guarantee success. Thank you!

This would be a complex feature to implement. It is definitely doable, but there are some moving parts that would have to come together:

  • Wrap an original Airflow operator, such as the TriggerDagRunOperator, as a DominoPiece. We could call it TriggerWorkflowPiece.
  • Add a frontend representation of this TriggerWorkflowPiece. This would require a new category of data to be fetched and sent to the Forms; in this case, it would be the list of available Workflows for that Workspace.
  • Adapt the current backend system to run this Piece. I had started a branch to run traditional Airflow operators as Domino Pieces. This would in principle be able to deal with a feature like TriggerWorkflowPiece, but we haven't yet figured out a way to implement it.
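
To make the first bullet more concrete, here is a minimal sketch of how a TriggerWorkflowPiece might map its form input onto Airflow's TriggerDagRunOperator. The names `TriggerWorkflowInput`, `build_trigger_kwargs`, and `build_trigger_operator` are hypothetical and not part of Domino; the import path assumes Airflow 2.x. It only illustrates the idea and does not reflect how Domino actually runs Pieces.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class TriggerWorkflowInput:
    """Hypothetical input model for a TriggerWorkflowPiece (not a Domino class)."""
    workflow_dag_id: str                                  # DAG id of the workflow to trigger
    conf: Dict[str, Any] = field(default_factory=dict)    # payload passed to the triggered run
    wait_for_completion: bool = True                      # block until the triggered run finishes


def build_trigger_kwargs(task_id: str, piece_input: TriggerWorkflowInput) -> Dict[str, Any]:
    """Translate the piece input into keyword arguments for TriggerDagRunOperator."""
    if not piece_input.workflow_dag_id:
        raise ValueError("workflow_dag_id must not be empty")
    return {
        "task_id": task_id,
        "trigger_dag_id": piece_input.workflow_dag_id,
        "conf": dict(piece_input.conf),  # copy so the caller's dict is not shared
        "wait_for_completion": piece_input.wait_for_completion,
    }


def build_trigger_operator(task_id: str, piece_input: TriggerWorkflowInput):
    """Create the real operator. Airflow is imported lazily so the helpers
    above can be used and tested without an Airflow installation."""
    from airflow.operators.trigger_dagrun import TriggerDagRunOperator  # Airflow 2.x path
    return TriggerDagRunOperator(**build_trigger_kwargs(task_id, piece_input))
```

In this sketch the frontend would only need to supply `workflow_dag_id` (chosen from the list of Workflows available in the Workspace) plus an optional `conf` payload.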

@lanzhixi you are of course welcome to try and contribute!

Thank you!

Hello @luiztauffer, over this period I have learned some front-end and back-end development, as well as some Airflow. I found that task groups (https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/dags.html#taskgroups) seem to be able to encapsulate workflows.

I roughly understand how the front end and back end interact. When the user clicks save in the front end, it creates a new workflow through a request to /workspaces/{workspace_id}/workflows. The create_workflow function in workflow_router.py on the backend is called, which generates a new workflow. While creating the workflow, the backend calls the _create_dag_code_from_raw_json function in workflow_service.py to convert the JSON in the request body into DAG code. I am wondering whether it's possible to add support for task groups in this function, so that workflows can be encapsulated. Is my idea feasible?
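
As a rough illustration of the idea, the sketch below renders DAG source code with Airflow task groups from a simplified task description. The input shape (`group` and `dependencies` keys) and the function `render_dag_with_groups` are assumptions made for this example and are not the actual JSON handled by _create_dag_code_from_raw_json. EmptyOperator stands in for the Domino Task, and `schedule=None` assumes Airflow 2.4 or later.

```python
from typing import Any, Dict, List


def render_dag_with_groups(dag_id: str, tasks: Dict[str, Dict[str, Any]]) -> str:
    """Render DAG source where tasks sharing a "group" value are wrapped in a TaskGroup.

    tasks maps task_id -> {"group": str (optional), "dependencies": [task_id, ...] (optional)}.
    This shape is hypothetical, chosen only for the example.
    """
    if not tasks:
        raise ValueError("a DAG needs at least one task")
    for task_id in tasks:
        if not task_id.isidentifier():
            raise ValueError(f"task id {task_id!r} is not a valid Python identifier")

    # Split tasks into ungrouped ones and per-group member lists.
    groups: Dict[str, List[str]] = {}
    ungrouped: List[str] = []
    for task_id, spec in tasks.items():
        group = spec.get("group")
        if group:
            groups.setdefault(group, []).append(task_id)
        else:
            ungrouped.append(task_id)

    lines = [
        "from datetime import datetime",
        "from airflow import DAG",
        "from airflow.operators.empty import EmptyOperator",
        "from airflow.utils.task_group import TaskGroup",
        "",
        f"with DAG(dag_id={dag_id!r}, start_date=datetime(2024, 1, 1), schedule=None) as dag:",
    ]
    for task_id in ungrouped:
        lines.append(f"    {task_id} = EmptyOperator(task_id={task_id!r})")
    for group, members in groups.items():
        lines.append(f"    with TaskGroup(group_id={group!r}):")
        for task_id in members:
            lines.append(f"        {task_id} = EmptyOperator(task_id={task_id!r})")
    # Dependencies are emitted last, once every task variable exists.
    for task_id, spec in tasks.items():
        for upstream in spec.get("dependencies", []):
            lines.append(f"    {upstream} >> {task_id}")
    return "\n".join(lines) + "\n"
```

Calling `render_dag_with_groups("demo", {"load": {}, "clean": {"group": "prep", "dependencies": ["load"]}})` returns source in which `clean` is defined inside a `with TaskGroup(group_id='prep'):` block and `load >> clean` wires the dependency.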

Although I now roughly understand the process, there is still a big gap between that and a hands-on implementation, so I would like to ask for your help designing the data format exchanged between the front end and the back end, as well as the backend part. I can try the front-end part. Thank you!!!

We really need this feature, thank you very much!!!

Hey @lanzhixi, nice to see you digging into Domino's code!

Your feature idea is indeed very interesting, and we definitely want to make this possible at some point. However, I believe that updating only the _create_dag_code_from_raw_json function in workflow_service.py may not be sufficient due to how the database is structured and how workflow creation and execution occur.

Let's break down the processes:

To create a workflow, the user views a list of installed pieces repositories in the selected workspace. Each repository contains one or more unique pieces.
When creating a workflow, the workflow_service generates a DAG file where each step of the workflow is a Domino Task. This Task is responsible for executing a specific piece unique to the context of the workspace to which the workflow belongs.

If we want to allow users to add pre-existing workflows, I believe we would need to:

  1. Modify the frontend to display available workflows that can be used as "pieces" in the new workflow, in addition to the default pieces from the pieces repositories.
  2. Define, in some way, the data input and output models of a workflow. Currently, there is no direct database-level relationship between workflows and the pieces to be executed. This information is only present in the workflow DAG file. When dealing with workflows "as pieces", we must consider both the input and output pieces of the workflow as the input and output data models. This is necessary for rendering the forms and allowing upstream connections between data of the same type.
  3. Potentially update the database to include some level of relationship between workflows and pieces, or even create a new table for groups.
  4. Possibly update the Domino classes to handle the new data transmission design.
  5. All the changes I described are technically feasible, but they will require some engineering work across various parts of Domino, including REST, the Frontend, and the Domino Package.
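
One possible reading of item 2 above, sketched under simplifying assumptions: treat the pieces with no upstream dependency as the workflow's input pieces and the pieces nothing depends on as its output pieces, then expose their schemas as the workflow's own input and output models. The function `derive_workflow_io` and the flat `name -> type` schema dictionaries are hypothetical and do not correspond to Domino's actual data models.

```python
from typing import Any, Dict


def derive_workflow_io(tasks: Dict[str, Dict[str, Any]]) -> Dict[str, Dict[str, str]]:
    """Derive a workflow-level input/output model from its pieces.

    tasks maps task_id -> {
        "dependencies": [upstream task ids],
        "input_schema": {field name: type name},
        "output_schema": {field name: type name},
    }  (hypothetical shape used only in this example)
    """
    # Every task that appears as someone's upstream has a downstream consumer.
    has_downstream = {
        upstream
        for spec in tasks.values()
        for upstream in spec.get("dependencies", [])
    }
    entry_tasks = [t for t, spec in tasks.items() if not spec.get("dependencies")]
    terminal_tasks = [t for t in tasks if t not in has_downstream]

    # Prefix field names with the task id to avoid collisions between pieces.
    workflow_input = {
        f"{t}.{name}": type_name
        for t in entry_tasks
        for name, type_name in tasks[t].get("input_schema", {}).items()
    }
    workflow_output = {
        f"{t}.{name}": type_name
        for t in terminal_tasks
        for name, type_name in tasks[t].get("output_schema", {}).items()
    }
    return {"input": workflow_input, "output": workflow_output}
```

Note that this sketch exposes only the fields of entry and terminal pieces. Inputs of intermediate pieces that are not fed by an upstream piece would need separate handling, which is one of the design questions that would have to be settled.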

If you are curious about the database design and the function of each class in the Domino package, you can refer to the documentation in the following sections:

Database: Domino Database Documentation
Package: Domino Python Package Documentation
REST: Domino REST Documentation

Also, if you feel there is any information missing from the documentation, please let us know and we will include it. The documentation is under continuous development as well.

We really appreciate your enthusiasm for this feature, and I hope we can work together to make it happen!

Thank you for your support, I am working on it step by step.