Triggering authenticated workflows
DiamondJoseph opened this issue · 2 comments
Assuming:
- There is a service called blueapi, which requires an authenticated request and creates "raw data"
- There is a process called analysis, which generically consumes "raw data" and creates "processed data"
- A user makes a request to blueapi to create "raw data" and knows they want a specific form of analysis to produce "processed data", either while blueapi is running or afterwards.
- To leverage the workflow system, the user should not need to manually create the analysis instance
- The analysis instance should write data to the same visit as the raw data and the request that spawned it
- The analysis should be authorized to read only the raw data that it requires (the token scopes involved are sketched below)
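To make these scopes concrete: the access token attached to Alice's request might carry visit-bound claims like the ones below. This is a minimal sketch; the claim names, scope strings, and `visit` claim are illustrative assumptions, not a defined blueapi contract.

```python
# Hypothetical decoded payload of the access token Alice presents to blueapi.
# The scope strings and the "visit" claim are illustrative assumptions only.
alice_token_claims = {
    "sub": "alice",                                # who is asking
    "aud": "blueapi",                              # the service being called
    "scope": "data:read data:write analysis:run",  # what Alice may do...
    "visit": "a1",                                 # ...bound to this visit only
    "exp": 1735689600,                             # short-lived, as usual
}
```

The sequence below walks through how these scopes propagate from Alice to blueapi, the Workflow Manager, and the analysis instance: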
```mermaid
sequenceDiagram
    actor Alice
    Note left of Alice: my_scan uses my_analysis
    Alice ->> +blueapi: run my_scan, visit=a1
    Note over Alice,blueapi: scope read data visit=a1
    Note over Alice,blueapi: scope write data visit=a1
    Note over Alice,blueapi: scope run my_analysis visit=a1
    participant raw as Raw Data Store<br>[via DataAPI]
    blueapi ->> raw: StartDocument runid=a1-1
    Note over blueapi,raw: AuthZ'd to write
    participant manager as Workflow Manager
    blueapi ->> manager: start my_analysis visit=a1 runid=a1-1
    Note over blueapi,manager: AuthZ'd to run
    create participant Analysis as my_analysis
    manager ->> +Analysis: creates
    Note over manager,Analysis: scope read data visit=a1
    Note over manager,Analysis: scope write data visit=a1
    participant processed as Processed Data Store<br>[via DataAPI]
    opt Live Analysis
        Analysis ->> raw: fetch data so far
        raw ->> Analysis: raw data
        Note over Analysis,raw: AuthZ'd to read
        Analysis ->> processed: processed data
        Note over Analysis,processed: AuthZ'd to write
        loop until scan over
            blueapi ->> raw: Documents
            Analysis -->> raw: poll for new data
            Analysis ->> processed: processed data
        end
        blueapi ->> raw: StopDocument
        Analysis -->> raw: poll for new data
        Analysis ->> processed: processed data
    end
    opt Post Processing
        blueapi ->> raw: Documents
        blueapi ->> raw: StopDocument
        Analysis ->> raw: fetch all data
        raw ->> Analysis: raw data
        Note over raw,Analysis: AuthZ'd to read
    end
    deactivate blueapi
    deactivate Analysis
    destroy Analysis
    Analysis ->> processed: processed data
    Note over Analysis,processed: AuthZ'd to write
    Alice ->> raw: fetch raw data
    raw ->> Alice: raw data
    Note over Alice,raw: AuthZ'd to read
    Alice ->> processed: fetch processed data
    processed ->> Alice: processed data
    Note over Alice,processed: AuthZ'd to read
```
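The "creates" step, in which the Workflow Manager hands my_analysis credentials narrowed to visit a1, could be realised with OAuth 2.0 Token Exchange (RFC 8693): the manager trades the token it received for one carrying only the scopes the analysis needs. A minimal sketch, assuming a standard token endpoint; the URL and scope format are placeholders, not an existing service.

```python
import requests

TOKEN_ENDPOINT = "https://auth.example.org/token"  # placeholder identity provider

def token_for_analysis(subject_token: str, visit: str) -> str:
    """Exchange the incoming token for one scoped to a single visit (RFC 8693)."""
    resp = requests.post(
        TOKEN_ENDPOINT,
        data={
            "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
            "subject_token": subject_token,
            "subject_token_type": "urn:ietf:params:oauth:token-type:access_token",
            # Hypothetical scope format: read raw data, write processed data.
            "scope": f"data:read:{visit} data:write:{visit}",
        },
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["access_token"]
```

my_analysis would then present the narrowed token to the DataAPI on every read of raw data and write of processed data, which is what the "AuthZ'd to read"/"AuthZ'd to write" notes in the diagram stand for.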
@callumforrester has thoughts about whether the live and at-rest processing should look the same or not:

DISCLAIMER: I'm not in data analysis and my knowledge may be out of date.

> this looks the same regardless of whether it's post or live analysis.

Not quite; for post-processing the code can and should be considerably simpler. There is no need to step through the data as if it were being streamed when it isn't; you want the code to say something like:
```python
import numpy as np

def average_frames(data_api):
    detector_data = data_api.get("saxs")[:]
    return np.average(detector_data, axis=0)
```
This is especially important because (I believe) most of our use cases are still post-processing rather than live processing, so we shouldn't introduce unnecessary complexity into the majority use case.
See bluesky/tiled#437 for the Tiled implementation of the DataAPI informing the client that more data is available for consumption.
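For contrast with the three-line at-rest snippet above, here is roughly what the live path has to do instead. This is a sketch only: `data_api.get("saxs")` returning the frames written so far and `data_api.scan_finished()` are hypothetical stand-ins for a real DataAPI surface, and the sleep-and-poll loop is exactly the part a notification mechanism like the one in bluesky/tiled#437 would replace.

```python
import time
import numpy as np

def live_average(data_api, poll_s: float = 1.0):
    """Hypothetical live analysis: re-emit a running average as frames arrive."""
    n_seen = 0
    running_sum = None
    while True:
        frames = np.asarray(data_api.get("saxs")[:])  # frames written so far
        if len(frames) > n_seen:
            new_sum = frames[n_seen:].sum(axis=0)
            running_sum = new_sum if running_sum is None else running_sum + new_sum
            n_seen = len(frames)
            yield running_sum / n_seen  # updated "processed data"
        if data_api.scan_finished():  # the StopDocument has arrived
            return
        time.sleep(poll_s)  # polling; tiled#437 aims to push instead
```

The extra bookkeeping (partial reads, incremental state, detecting termination) is the complexity the comment above argues should not be imposed on the post-processing majority.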