[BUG] Issues running the batch_to_eopatch pipeline
Closed this issue · 5 comments
Question
I have successfully run the batch download pipeline and would like to convert the batch tiles to eopatches. After locally fixing #12 I've managed to run the batch_to_eopatch pipeline, but I get the following exception in the logs:
Summary of exceptions
LoadUserDataTask (LoadUserDataTask-29825b248e7b11ecbc3b-f57730fc0853):
14 times:
TypeError: execute() missing 1 required positional argument: 'eopatch'
Which is weird, because the LoadUserDataTask is the first task and no eopatch argument should be expected.
Here is my config:
{
"pipeline": "eogrow.pipelines.batch_to_eopatch.BatchToEOPatchPipeline",
"folder_key": "data",
"mapping": [
{"batch_files": ["B01.tif"], "feature_type": "data", "feature_name": "B01", "multiply_factor": 1e-4},
{"batch_files": ["B02.tif"], "feature_type": "data", "feature_name": "B02", "multiply_factor": 1e-4},
{"batch_files": ["B03.tif"], "feature_type": "data", "feature_name": "B03", "multiply_factor": 1e-4},
{"batch_files": ["B04.tif"], "feature_type": "data", "feature_name": "B04", "multiply_factor": 1e-4},
{"batch_files": ["B05.tif"], "feature_type": "data", "feature_name": "B05", "multiply_factor": 1e-4},
{"batch_files": ["B06.tif"], "feature_type": "data", "feature_name": "B06", "multiply_factor": 1e-4},
{"batch_files": ["B07.tif"], "feature_type": "data", "feature_name": "B07", "multiply_factor": 1e-4},
{"batch_files": ["B08.tif"], "feature_type": "data", "feature_name": "B08", "multiply_factor": 1e-4},
{"batch_files": ["B8A.tif"], "feature_type": "data", "feature_name": "B8A", "multiply_factor": 1e-4},
{"batch_files": ["B09.tif"], "feature_type": "data", "feature_name": "B09", "multiply_factor": 1e-4},
{"batch_files": ["B10.tif"], "feature_type": "data", "feature_name": "B10", "multiply_factor": 1e-4},
{"batch_files": ["B11.tif"], "feature_type": "data", "feature_name": "B11", "multiply_factor": 1e-4},
{"batch_files": ["B12.tif"], "feature_type": "data", "feature_name": "B12", "multiply_factor": 1e-4},
{"batch_files": ["CLP.tif"], "feature_type": "data", "feature_name": "CLP", "multiply_factor": 0.00392156862745098},
{"batch_files": ["CLM.tif"], "feature_type": "mask", "feature_name": "CLM"},
{"batch_files": ["dataMask.tif"], "feature_type": "mask", "feature_name": "dataMask"}
],
"userdata_feature_name": "BATCH_INFO",
"userdata_timestamp_reader": "eogrow.utils.batch.read_timestamps_from_orbits",
"**global_settings": "${config_path}/sentinel2_l1c_batch_config.json"
}
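For reference, the two multiply_factor values in the mapping look like common rescaling constants: 1e-4 would presumably turn integer band values stored in units of 1/10000 into reflectance, and 0.00392156862745098 equals 1/255, which would map 8-bit cloud probabilities to the range [0, 1]. A minimal sketch of that arithmetic, assuming the pipeline simply multiplies the imported values by the factor (the sample values are made up and `rescale` is a hypothetical helper, not part of eo-grow):

```python
import math

REFLECTANCE_FACTOR = 1e-4
CLP_FACTOR = 0.00392156862745098


def rescale(values, factor):
    """Multiply every raw value by the factor (hypothetical helper)."""
    return [value * factor for value in values]


# The CLP factor is 1/255 written out as a decimal.
assert math.isclose(CLP_FACTOR, 1 / 255)

# Made-up raw values: the largest one in each list should land close to 1.0.
band = rescale([0, 2500, 10000], REFLECTANCE_FACTOR)
clp = rescale([0, 128, 255], CLP_FACTOR)
print(band)
print(clp)
```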
Let me know if you need to see what sentinel2_l1c_batch_config.json looks like.
Ah, the eopatch argument is annotated as Optional[EOPatch], but apparently we forgot to add a default value.
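The difference between an Optional annotation and an optional argument can be shown with a small stand-in. This is only a sketch: `EOPatch`, `BrokenTask` and `FixedTask` below are simplified placeholders, not the eo-learn or eo-grow classes, and the real `execute` signature may differ.

```python
from typing import Optional


class EOPatch:
    """Placeholder for the real EOPatch class (assumption, for illustration)."""


class BrokenTask:
    # Optional[...] only says that None is an allowed value.
    # The caller still has to pass the argument explicitly.
    def execute(self, eopatch: Optional[EOPatch]) -> EOPatch:
        return EOPatch() if eopatch is None else eopatch


class FixedTask:
    # A default of None lets the task be called without any input,
    # which is what happens when it is the first node of a workflow.
    def execute(self, eopatch: Optional[EOPatch] = None) -> EOPatch:
        return EOPatch() if eopatch is None else eopatch


try:
    BrokenTask().execute()
except TypeError as error:
    # Same kind of "missing 1 required positional argument" error as in the logs.
    print(error)

patch = FixedTask().execute()
print(type(patch).__name__)
```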
Thanks for the hint. I tried setting the default to None and got a new error:
❯ eogrow 01_batch_to_eopatch.json
INFO eogrow.core.pipeline:216: Running BatchToEOPatchPipeline
INFO eogrow.core.area.base:176: Loading grid from cache/grid_test_area_BatchAreaManager_0.2_0.004_1_10.0_0.gpkg
INFO eogrow.core.pipeline:159: Searching for Ray cluster
INFO eogrow.core.pipeline:164: No cluster found, pipeline will not use Ray.
INFO eogrow.core.pipeline:174: Starting EOExecutor for 14 EOPatches
0%| | 0/14 [00:00<?, ?it/s]
Warning 1: TIFFReadDirectory:Sum of Photometric type-related color channels and ExtraSamples doesn't match SamplesPerPixel. Defining non-color channels as ExtraSamples.
Warning 1: TIFFReadDirectory:Sum of Photometric type-related color channels and ExtraSamples doesn't match SamplesPerPixel. Defining non-color channels as ExtraSamples.
Warning 1: TIFFReadDirectory:Sum of Photometric type-related color channels and ExtraSamples doesn't match SamplesPerPixel. Defining non-color channels as ExtraSamples.
0%| | 0/14 [00:04<?, ?it/s]
Traceback (most recent call last):
File "/Users/mlubej/.pyenv/versions/surs/bin/eogrow", line 33, in <module>
sys.exit(load_entry_point('eo-grow', 'console_scripts', 'eogrow')())
File "/Users/mlubej/.pyenv/versions/3.8.7/envs/surs/lib/python3.8/site-packages/click/core.py", line 1128, in __call__
return self.main(*args, **kwargs)
File "/Users/mlubej/.pyenv/versions/3.8.7/envs/surs/lib/python3.8/site-packages/click/core.py", line 1053, in main
rv = self.invoke(ctx)
File "/Users/mlubej/.pyenv/versions/3.8.7/envs/surs/lib/python3.8/site-packages/click/core.py", line 1395, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/Users/mlubej/.pyenv/versions/3.8.7/envs/surs/lib/python3.8/site-packages/click/core.py", line 754, in invoke
return __callback(*args, **kwargs)
File "/Users/mlubej/work/projects/sh-project/eo-grow/eogrow/cli.py", line 80, in main
pipeline.run()
File "/Users/mlubej/work/projects/sh-project/eo-grow/eogrow/core/pipeline.py", line 220, in run
finished, failed = self.run_procedure()
File "/Users/mlubej/work/projects/sh-project/eo-grow/eogrow/core/pipeline.py", line 263, in run_procedure
finished, failed, _ = self.run_execution(workflow, exec_args)
File "/Users/mlubej/work/projects/sh-project/eo-grow/eogrow/core/pipeline.py", line 185, in run_execution
execution_results = executor.run(**executor_run_params)
File "/Users/mlubej/work/projects/sh-project/eo-learn/core/eolearn/core/eoexecution.py", line 187, in run
full_execution_results = self._run_execution(processing_args, workers, processing_type)
File "/Users/mlubej/work/projects/sh-project/eo-learn/core/eolearn/core/eoexecution.py", line 219, in _run_execution
return submit_and_monitor_execution(process_executor, self._execute_workflow, processing_args)
File "/Users/mlubej/work/projects/sh-project/eo-learn/core/eolearn/core/eoexecution.py", line 398, in submit_and_monitor_execution
results[future_order[future]] = future.result()
File "/Users/mlubej/.pyenv/versions/3.8.7/lib/python3.8/concurrent/futures/_base.py", line 432, in result
return self.__get_result()
File "/Users/mlubej/.pyenv/versions/3.8.7/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
raise self._exception
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "/Users/mlubej/.pyenv/versions/3.8.7/lib/python3.8/logging/__init__.py", line 2123, in shutdown
h.close()
File "/Users/mlubej/work/projects/sh-project/eo-grow/eogrow/core/logging.py", line 253, in close
self.local_file.close()
File "/Users/mlubej/work/projects/sh-project/eo-grow/eogrow/utils/fs.py", line 90, in close
self.copy_to_remote()
File "/Users/mlubej/work/projects/sh-project/eo-grow/eogrow/utils/fs.py", line 103, in copy_to_remote
fs.copy.copy_file(self._filesystem, self._local_path, self._remote_filesystem, self._remote_path)
File "/Users/mlubej/.pyenv/versions/3.8.7/envs/surs/lib/python3.8/site-packages/fs/copy.py", line 142, in copy_file
copy_file_if(
File "/Users/mlubej/.pyenv/versions/3.8.7/envs/surs/lib/python3.8/site-packages/fs/copy.py", line 221, in copy_file_if
copy_file_internal(
File "/Users/mlubej/.pyenv/versions/3.8.7/envs/surs/lib/python3.8/site-packages/fs/copy.py", line 277, in copy_file_internal
_copy_locked()
File "/Users/mlubej/.pyenv/versions/3.8.7/envs/surs/lib/python3.8/site-packages/fs/copy.py", line 270, in _copy_locked
dst_fs.upload(dst_path, read_file)
File "/Users/mlubej/.pyenv/versions/3.8.7/envs/surs/lib/python3.8/site-packages/fs_s3fs/_s3fs.py", line 774, in upload
self.client.upload_fileobj(
File "/Users/mlubej/.pyenv/versions/3.8.7/envs/surs/lib/python3.8/site-packages/boto3/s3/inject.py", line 537, in upload_fileobj
future = manager.upload(
File "/Users/mlubej/.pyenv/versions/3.8.7/envs/surs/lib/python3.8/site-packages/s3transfer/manager.py", line 329, in upload
return self._submit_transfer(
File "/Users/mlubej/.pyenv/versions/3.8.7/envs/surs/lib/python3.8/site-packages/s3transfer/manager.py", line 524, in _submit_transfer
self._submission_executor.submit(
File "/Users/mlubej/.pyenv/versions/3.8.7/envs/surs/lib/python3.8/site-packages/s3transfer/futures.py", line 474, in submit
future = ExecutorFuture(self._executor.submit(task))
File "/Users/mlubej/.pyenv/versions/3.8.7/lib/python3.8/concurrent/futures/thread.py", line 181, in submit
raise RuntimeError('cannot schedule new futures after '
RuntimeError: cannot schedule new futures after interpreter shutdown
We discovered that the issue is not in multithreading but instead lies in reading tiffs with ImportFromTiffTask. Investigating further.
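One way to check whether a single operation, such as a tiff read, takes the worker process down is to run it in a separate interpreter and look at the exit code. A minimal sketch using only the standard library; `run_isolated` is a hypothetical helper, and the commented rasterio snippet with its placeholder path is an illustrative assumption, not what ImportFromTiffTask does internally.

```python
import subprocess
import sys


def run_isolated(code: str, timeout: float = 60.0) -> int:
    """Run a Python snippet in a fresh interpreter and return its exit code.

    A non-zero code means the snippet failed. On POSIX systems a negative
    code means the child was killed by a signal, for example a crash in
    native code.
    """
    completed = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if completed.returncode != 0 and completed.stderr:
        # Show whatever the child managed to report before it stopped.
        print(completed.stderr.strip())
    return completed.returncode


# One snippet that exits normally and one that dies without any cleanup.
print(run_isolated("print('fine')"))
print(run_isolated("import os; os._exit(7)"))

# Hypothetical use against a single batch tile (the path is a placeholder):
# run_isolated("import rasterio; rasterio.open('path/to/B01.tif').read()")
```

Because the parent process survives, the exit code of each file can be inspected in turn instead of ending up with a BrokenProcessPool that hides the original failure.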