mrpaulandrew/procfwk

Avoid pointless execution run after a CleanPreviousRun results in all pipelines having the status Success

pfiadeiro opened this issue · 3 comments

Use Case:

If, for some reason, the metadata of some pipelines isn't updated, then when a new execution is triggered and the Clean Up Previous Run activity runs, the status of any pipelines that actually completed successfully but weren't updated will be changed to Success.

Once that activity completes, the Execution Wrapper activity will start a new execution. This run will be pointless: since all pipelines have already finished successfully, nothing will happen during it; it will only serve to clean the CurrentExecution table at the end.

Suggestion:

Tweak the ExecutionWrapper stored procedure to consider this particular case.

One possible solution is, after the check for a running execution is done, to add the following piece of code:

-- Check for all pipelines' statuses being Success after a clean run
IF EXISTS
(
    SELECT 1 FROM [procfwk].[CurrentExecution]
)
AND NOT EXISTS
(
    SELECT 1 FROM [procfwk].[CurrentExecution] WHERE [PipelineStatus] <> 'Success'
)
BEGIN
    EXEC [procfwk].[UpdateExecutionLog]
        @PerformErrorCheck = 0;
END;

@pfiadeiro

I've tested this situation and, based on my understanding, the ExecutionWrapper proc does handle this as part of the ultimate ELSE condition...

-- No restart considerations, just create new execution
ELSE
    BEGIN
        EXEC [procfwk].[CreateNewExecution]
            @CallingDataFactoryName = @CallingDataFactory;
    END

What it doesn't do is move the successful worker records into the ExecutionLog table before doing the truncate. Which I'll fix.

However, I'm thinking about the title of your issue: "Avoid pointless execution run". Are you suggesting that, when triggered, if all workers are already successful, the framework should just update the logs and stop?

Is this a pointless old execution run? Or just a new run that should have its logs updated?

Cheers

@mrpaulandrew

You're right, for some reason I didn't notice that. It will create a new run and just miss the logs of the previous run, but you could probably even use that bit of code I posted to archive the logs before starting the new run.

It's actually a good question. From my perspective it's an old execution run, but I can see arguments for either option. Let me explain why:

  • Update logs and stop - if a worker's status isn't updated for some reason, the framework will raise an error when running the sp UpdateExecutionLog, so we'll be aware that something went wrong. If I restart the framework and the worker actually failed, the clean-up will update its status, the run will be restarted (assuming OverideRestart = 0) and it will eventually finish. Now assume instead that the worker whose status we didn't know finished successfully: a new run will be created and will finish successfully, and I won't immediately know whether it was a new run that finished or the previous one (yes, I'm aware that timings, execution IDs, etc. will probably differ).

  • Archive log, create new run - to save time, this is probably the best option. The previous run finished fine, so just archive the logs and start a new one, rather than forcing the user to start the execution twice.
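
The second option could be sketched inside ExecutionWrapper roughly as follows. This is only a sketch based on the snippets already in this thread; it assumes the [procfwk].[CurrentExecution], [procfwk].[UpdateExecutionLog] and [procfwk].[CreateNewExecution] objects referenced above, and the exact placement within the proc's existing IF/ELSE branches would be up to the maintainer:

-- Sketch: archive a fully successful previous run, then start a new one.
IF EXISTS (SELECT 1 FROM [procfwk].[CurrentExecution])
    AND NOT EXISTS
    (
        SELECT 1 FROM [procfwk].[CurrentExecution]
        WHERE [PipelineStatus] <> 'Success'
    )
BEGIN
    -- Move the successful worker records into the ExecutionLog table;
    -- skip the error check since nothing failed.
    EXEC [procfwk].[UpdateExecutionLog]
        @PerformErrorCheck = 0;
END;

-- Then proceed as the existing ELSE branch already does:
EXEC [procfwk].[CreateNewExecution]
    @CallingDataFactoryName = @CallingDataFactory;

This way the previous run's logs are preserved before CurrentExecution is truncated, and the user doesn't have to trigger the framework twice.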

Do you see what I mean?

Edit: My "pointless" tag was aimed at a run that wouldn't do anything at all, but I was wrong about that, so this isn't really about something being pointless anymore. It's more about how to approach a run that finished successfully but, for some reason, wasn't recorded as such.

Sure, yes, ok.
I'll update the execution wrapper.
Thanks