natcap/taskgraph

logging from is_precalculated can be misleading at certain levels

davemfish opened this issue · 0 comments

I got tripped up by this when doing repeated invest runs in the same workspace but with a varying "results_suffix". In other words, repeated runs in these conditions:

Tasks are added with the hash_algorithm='md5' and copy_duplicate_artifact=True options. The target_path_list filenames have changed, but the hash is pre-existing and so the task should not have to re-calculate.

The logs look like this:

2020-07-06 19:57:23,090 (11567) WARNING taskgraph.Task [is_precalculated:1319] not precalculated (calculate geomorphology exposure (17)), Task hash exists, but there are these mismatches: Recorded path not in target path list /home/dmf/projects/natcap/batch_cv/test/intermediate/geomorphology/geomorph_ProtectMangrove_BZ.pickle
2020-07-06 19:57:23,090 (11568) DEBUG taskgraph.Task [_call:1111] not precalculated calculate geomorphology exposure (17)
2020-07-06 19:57:23,090 (11568) DEBUG taskgraph.Task [_execute_sqlite:1617] file:///home/dmf/projects/natcap/batch_cv/test/intermediate/_taskgraph_working_dir/taskgraph_data.db?mode=ro exists: True
2020-07-06 19:57:23,091 (11568) DEBUG taskgraph.Task [_call:1128] duplicate artifact db results: [('/home/dmf/projects/natcap/batch_cv/test/intermediate/geomorphology/geomorph_ProtectMangrove_BZ.pickle', 'md5', '508691ac1402ccfd810585dc68a08f17')]
2020-07-06 19:57:23,091 (11569) DEBUG taskgraph.Task [_call:1141] copying stored artifacts to target path list. 
	stored artifacts: ['/home/dmf/projects/natcap/batch_cv/test/intermediate/geomorphology/geomorph_ProtectMangrove_BZ.pickle']
	target_path_list: ['/home/dmf/projects/natcap/batch_cv/test/intermediate/geomorphology/geomorph_ProtectCoral_BZ.pickle']

2020-07-06 19:57:23,108 (11586) DEBUG taskgraph.Task [_call:1207] successful run on task calculate geomorphology exposure (17)

That's all great except for the fact that the first message is a WARNING and the rest are DEBUG. So if I'm filtering at INFO, for example, I only see the first message which suggests the task will be re-calculated, which is not what I'm expecting.

It's tricky since with different add_task options this would be an important warning, but in this case it feels like I'm being warned about what I consider correct usage. Not sure if it's a good idea to demote the level of that warning, promote the level of DEBUG taskgraph.Task [_call:1141] copying stored artifacts to target path list. or something else entirely.