branching pipeline in YAML
droumis opened this issue · 2 comments
ALL software version info
lumen 0.5.0a64
Description of expected behavior and the observed behavior
I'm trying to branch a pipeline in YAML. I may be misunderstanding the API, but I would have expected the code below to run. The traceback is complaining about the pipeline branch ('branch_sort') not having a 'source'... but it should be inheriting the source from the pipeline that it's branching from.
Complete, minimal, self-contained example code that reproduces the issue
sources:
  penguin_source:
    type: file
    tables:
      penguin_table: https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-07-28/penguins.csv
pipelines:
  penguin_pipeline:
    source: penguin_source
    table: penguin_table
    filters:
      - type: widget
        field: island
  branch_sort:
    pipeline: penguin_pipeline
    transforms:
      - type: columns
        columns: ['species', 'island', 'bill_length_mm', 'bill_depth_mm']
targets:
  - title: Penguins
    sizing_mode: stretch_width
    views:
      - type: table
        pipeline: penguin_pipeline
        show_index: false
        height: 300
      - type: table
        pipeline: branch_sort
        show_index: false
        height: 300
Stack traceback
lumen serve penguin.yml
KeyError: 'source'
Traceback (most recent call last):
  File "/Users/droumis/src/lumen/lumen/dashboard.py", line 362, in _render_dashboard
    self._materialize_specification()
  File "/Users/droumis/src/lumen/lumen/dashboard.py", line 407, in _materialize_specification
    state.load_pipelines(auto_update=self.config.auto_update)
  File "/Users/droumis/src/lumen/lumen/state.py", line 161, in load_pipelines
    for name, source_spec in self.spec.get('pipelines', {}).items()
  File "/Users/droumis/src/lumen/lumen/state.py", line 161, in <dictcomp>
    for name, source_spec in self.spec.get('pipelines', {}).items()
  File "/Users/droumis/src/lumen/lumen/pipeline.py", line 221, in from_spec
    source = spec['source']
KeyError: 'source'
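To make the expected behavior concrete, here is a rough sketch in plain Python of the inheritance I had in mind. It is not Lumen code: `resolve_pipeline_specs` is a hypothetical helper, and treating a `pipeline` key as "branch from this parent" and flattening the parent's filters and transforms into the branch are assumptions about the intended semantics, not a description of how Lumen resolves specs internally.

```python
from copy import deepcopy


def resolve_pipeline_specs(pipelines):
    """Return specs in which every branch has inherited from its parent.

    Hypothetical helper, not part of Lumen. A spec containing a 'pipeline'
    key is assumed to be a branch of the pipeline with that name.
    """
    resolved = {}

    def resolve(name, seen=()):
        if name in resolved:
            return resolved[name]
        if name in seen:
            raise ValueError(f"Circular pipeline reference: {name!r}")
        spec = deepcopy(pipelines[name])
        parent_name = spec.pop("pipeline", None)
        if parent_name is not None:
            parent = resolve(parent_name, seen + (name,))
            # Inherit source and table unless the branch overrides them.
            spec.setdefault("source", parent["source"])
            spec.setdefault("table", parent["table"])
            # Assumption: parent filters/transforms apply before the branch's own.
            for key in ("filters", "transforms"):
                spec[key] = deepcopy(parent.get(key, [])) + spec.get(key, [])
        resolved[name] = spec
        return spec

    for name in pipelines:
        resolve(name)
    return resolved


# The 'pipelines' section of the YAML above, written as a Python dict.
pipelines = {
    "penguin_pipeline": {
        "source": "penguin_source",
        "table": "penguin_table",
        "filters": [{"type": "widget", "field": "island"}],
    },
    "branch_sort": {
        "pipeline": "penguin_pipeline",
        "transforms": [
            {
                "type": "columns",
                "columns": ["species", "island", "bill_length_mm", "bill_depth_mm"],
            },
        ],
    },
}

resolved = resolve_pipeline_specs(pipelines)
print(resolved["branch_sort"]["source"])
```

With a resolution step like this, a lookup such as `spec['source']` on the branch would find the inherited value instead of raising `KeyError`.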
It seems that the transform on the pipeline branch is not being applied to the view that references it. In the image below, I would expect to see only the selected columns (['species', 'island', 'bill_length_mm', 'bill_depth_mm']) in the second table. This is using the same example script from the original comment above.
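For reference, this is roughly what I expect the `columns` transform to do to the branch's data, sketched in plain Python rather than through Lumen. The `select_columns` function is a hypothetical stand-in for the transform, and the two rows are illustrative values shaped like the penguins CSV, not output from the dashboard.

```python
def select_columns(rows, columns):
    """Keep only the named columns in each row (hypothetical stand-in
    for a 'columns' transform; not Lumen's implementation)."""
    return [{col: row[col] for col in columns} for row in rows]


# Illustrative rows with the same column names as the penguins CSV.
rows = [
    {
        "species": "Adelie", "island": "Torgersen",
        "bill_length_mm": 39.1, "bill_depth_mm": 18.7,
        "flipper_length_mm": 181, "body_mass_g": 3750,
        "sex": "male", "year": 2007,
    },
    {
        "species": "Gentoo", "island": "Biscoe",
        "bill_length_mm": 46.1, "bill_depth_mm": 13.2,
        "flipper_length_mm": 211, "body_mass_g": 4500,
        "sex": "female", "year": 2007,
    },
]

columns = ["species", "island", "bill_length_mm", "bill_depth_mm"]
branch_rows = select_columns(rows, columns)

for row in branch_rows:
    print(row)
```

The second table should then show only those four columns, while the first table (on `penguin_pipeline`) keeps all of them.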