The Customer Data Platform for Developers
Website · Documentation · Slack Community
The RudderStack Airflow Provider lets you programmatically schedule and trigger your Reverse ETL syncs from outside RudderStack and integrate them with your existing Airflow workflows.
For more information on using the Airflow Provider utility, refer to the documentation.
pip install rudderstack-airflow-provider
Note: Use RudderstackRETLOperator for Reverse ETL connections.
A simple DAG for triggering syncs for a RudderStack source:
with DAG(
    'rudderstack-sample',
    default_args=default_args,
    description='A simple tutorial DAG',
    schedule_interval=timedelta(days=1),
    start_date=datetime(2021, 1, 1),
    catchup=False,
    tags=['rs']
) as dag:
    rs_operator = RudderstackOperator(
        source_id='<source-id>',
        task_id='<any-task-id>',
        connection_id='rudderstack_conn'
    )
For the complete code, refer to this example.
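The DAG above passes `default_args` without defining it. A minimal sketch of what such a dictionary might look like is shown here; the specific values are illustrative assumptions, not taken from the project's example, and each key is a standard Airflow base operator argument.

```python
from datetime import timedelta

# Hypothetical values for illustration only; adjust them to your environment.
# Each key is a standard Airflow BaseOperator argument that Airflow applies
# to every task in the DAG unless the task overrides it.
default_args = {
    "owner": "airflow",
    "depends_on_past": False,
    "retries": 1,                         # retry a failed task once
    "retry_delay": timedelta(minutes=5),  # wait five minutes before retrying
}
```

Because these are applied per task, any of them can also be passed directly to an individual operator instead.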
Parameter | Description | Type | Default |
---|---|---|---|
source_id | Valid RudderStack source ID | String | None |
task_id | A unique task ID within a DAG | String | None |
wait_for_completion | If True, the task waits for the sync to complete. | Boolean | False |
connection_id | The Airflow connection to use for connecting to the RudderStack API. | String | rudderstack_default |
The RudderStack operator also supports all the parameters supported by the Airflow base operator.
For details on how to run the DAG in Airflow, refer to the documentation.
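The `connection_id` parameter refers to an Airflow connection that must exist before the DAG runs. One way to define a connection without the Airflow UI is an `AIRFLOW_CONN_<CONN_ID>` environment variable, which Airflow 2.3 and later can read in JSON form. The sketch below only builds that variable using the standard library. The helper name `airflow_conn_env` is hypothetical, and the host value and the use of the password field for the access token are assumptions that should be checked against the RudderStack documentation.

```python
import json
import os
from typing import Tuple


def airflow_conn_env(conn_id: str, host: str, token: str) -> Tuple[str, str]:
    """Return the (name, value) pair Airflow reads for an env-defined connection."""
    # Airflow looks up connections under AIRFLOW_CONN_<CONN_ID in upper case>.
    name = f"AIRFLOW_CONN_{conn_id.upper()}"
    # Assumption: the access token is stored in the connection's password field.
    value = json.dumps({"conn_type": "http", "host": host, "password": token})
    return name, value


name, value = airflow_conn_env(
    "rudderstack_default",          # the default connection_id from the table
    "https://api.rudderstack.com",  # assumed API host
    "<access-token>",               # placeholder, replace with a real token
)
os.environ[name] = value
```

In practice the variable would be exported in the environment of the Airflow scheduler and workers rather than set from Python.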
Trigger syncs for RETL connections
with DAG('rudderstack-sample',
         default_args=default_args,
         description='A simple tutorial DAG',
         schedule_interval=timedelta(days=1),
         start_date=datetime(2021, 1, 1),
         catchup=False,
         tags=['rs']) as dag:
    rs_operator = RudderstackRETLOperator(
        retl_connection_id='2aiDQzMqP6LNuUokWstmaubcZOP',
        task_id='retl-test-sync',
        connection_id='rudder_yeshwanth_dev',
        sync_type='full',
        wait_for_completion=True
    )
Parameter | Description | Type | Default |
---|---|---|---|
retl_connection_id | Valid RudderStack RETL connection ID | String (templatable) | None |
task_id | A unique task ID within a DAG | String | None |
wait_for_completion | If True, the task waits for the sync to complete. | Boolean | False |
connection_id | The Airflow connection to use for connecting to the RudderStack API. | String | rudderstack_default |
sync_type | Type of sync to trigger | incremental or full (templatable) | incremental |
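Since `retl_connection_id` and `sync_type` are marked as templatable, their values can be Jinja expressions that Airflow renders at run time. The following sketch assembles such keyword arguments as plain strings, so it runs without Airflow installed. The Airflow Variable name `retl_connection_id`, the `sync_type` key in the run configuration, and the task ID are hypothetical.

```python
# Keyword arguments intended for RudderstackRETLOperator. Nothing here imports
# Airflow, so the Jinja expressions stay plain strings until Airflow renders them.
retl_operator_kwargs = {
    "task_id": "retl-templated-sync",  # hypothetical task ID
    # Read the connection ID from a (hypothetical) Airflow Variable.
    "retl_connection_id": "{{ var.value.retl_connection_id }}",
    # Use the sync type from the trigger's run configuration, else incremental.
    "sync_type": "{{ dag_run.conf.get('sync_type', 'incremental') }}",
    "wait_for_completion": True,
}
```

Inside a DAG, these would be passed as `RudderstackRETLOperator(**retl_operator_kwargs)`, which lets a manual trigger request a full sync without editing the DAG file.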
For details on how to run the DAG in Airflow, refer to the documentation.
We would love to see you contribute to this project. Get more information on how to contribute here.
The RudderStack Airflow Provider is released under the MIT License.
For more information or queries on this feature, you can contact us or start a conversation in our Slack community.