There are lot of resources online explaining the use cases of Luigi, but not a lot of them explain how to setup and configure Luigi. Here, we have different Luigi tasks explaining the use cases and setup
Only package we need is luigi
$ pip install luigi
- Simple Hello World task - Simple example to show the structure of a Luigi task
- Task with dependecy - Example to show how to add dependencies to current task
- Task with mock target - I really do not have an output, but still want some task dependency
- PySpark Task - Example pyspark task with JDBC connection
Every task mandatorily needs to have an output, else luigi does not know if the task is completed or not. It is more like a book keeping for Luigi to find out the status of the task.
If you remove output()
method from any of the dependant task, the dependant task will run infinitely. So make sure you have the output method specified and is used to write output in all tasks.
To run any task, we follow the below command pattern
$ luigi --module mymodule MyTask --local-scheduler
Example:
$ luigi --module dependant_task DependantTask --local-scheduler
Luigi looks for config files in:
/etc/luigi/client.cfg
luigi.cfg
in the current working directoryLUIGI_CONFIG_PATH
environment variable
Most important part of the configuration is setting up a spark job or a pyspark job. Luigi config has sections [spark]
and [pyspark]
to specify extra JARs and drivers required to run the spark job.
The --local-scheduler
param to run the luigi module must be used only during
development. Once deployed, we need to use the central scheduler.
Running the central scheduler:
$ luigid
When luigid starts up, it looks for the config file in previously mentioned locations.
Running a luigi task with the central scheduler
$ luigi --module pyspark_task PySparkTableSchema
Luigi comes with a web dashboard for task history and statuses. The dashboard can
be accessed via http://localhost:8082
for local instance. The central server's
dashboard can be accessed via http://<central-scheduler-host>:8082