Issue with BenchmarkData causing pipelines to stall.

Question

Closed this issue 2 years ago · 1 comments

Hello,

When running one of our pipelines, I occasionally encounter the error below, which causes the pipeline to stall and requires a manual restart:

'builtins.ValueError(Type names and field names must be valid identifiers: 'binding_inuse!SCcCCCCCCSCCCCCCCC')' raised in ...
Task = def getUtronIds(...):
Job = [utron_beds.dir/HCT-RDMSO-R01.star.novel_utrons.bed.gz -> utron_beds.dir/HCT-RDMSO-R01.star.novel_utrons.ids.gz]

Traceback (most recent call last):
  File "/shared/sudlab1/General/projects/stem_utrons/envs/stem_utrons/lib/python3.7/site-packages/ruffus/task.py", line 713, in run_pooled_job_without_exceptions
    register_cleanup, touch_files_only)
  File "/shared/sudlab1/General/projects/stem_utrons/envs/stem_utrons/lib/python3.7/site-packages/ruffus/task.py", line 545, in job_wrapper_io_files
    ret_val = user_defined_work_func(*params)
  File "/shared/sudlab1/General/projects/stem_utrons/pipelines/pipeline_DTU/pipeline_DTU.py", line 454, in getUtronIds
    P.run(statement)
  File "/shared/sudlab1/General/projects/stem_utrons/envs/stem_utrons/lib/python3.7/site-packages/cgatcore/pipeline/execution.py", line 1230, in run
    'BenchmarkData', sorted(benchmark_data[0]))
  File "/shared/sudlab1/General/projects/stem_utrons/envs/stem_utrons/lib/python3.7/collections/__init__.py", line 361, in namedtuple
    raise ValueError('Type names and field names must be valid '
ValueError: Type names and field names must be valid identifiers: 'binding_inuse!SCcCCCCCCSCCCCCCCC'

Restarting the pipeline resolves the issue, and the statement appears to have run successfully and produced an output file, but the error still stalls the pipeline. Is there a way to 'ignore' the error so that the pipeline does not stop mid-run?

Best wishes
Jack.

Answer 1 · 2021-10-17T17:16:11.000Z

Sorry for taking a while to get back to you. This is a strange one. It indicates that there is likely an issue with your SGE configuration, which would fit with the finding that the problem is intermittent.

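The error is generated after a job finishes, when the benchmarking data is returned. Judging by the traceback, one of the keys in that data ('binding_inuse!SCcCCCCCCSCCCCCCCC') is not a valid Python identifier, so namedtuple refuses to use it as a field name.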
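
To see the failure outside the pipeline, here is a minimal sketch that reproduces the same ValueError. The record contents are an assumption: only the malformed key is taken from the traceback, and `wallclock`, its value, and the helper `invalid_field_names` are made up for illustration rather than being part of cgatcore.

```python
import keyword
from collections import namedtuple

# Hypothetical benchmark record. Only the malformed key comes from the
# traceback; "wallclock" and both values are invented for illustration.
benchmark_record = {
    "wallclock": "12.5",
    "binding_inuse!SCcCCCCCCSCCCCCCCC": "NONE",
}


def invalid_field_names(record):
    """Return the keys that namedtuple would reject as field names."""
    return [
        key
        for key in sorted(record)
        if not key.isidentifier()
        or keyword.iskeyword(key)
        or key.startswith("_")
    ]


bad_keys = invalid_field_names(benchmark_record)

# Mirrors the call shown in the traceback: the sorted keys become field names.
try:
    namedtuple("BenchmarkData", sorted(benchmark_record))
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(bad_keys)
print(error_message)
```

Running a check like `invalid_field_names` on the returned data just before the namedtuple is built would show which scheduler fields are arriving malformed.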

I'm not an SGE specialist, but you could try identifying the issue using:
qconf -suser [your user name]
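
As a possible stopgap while the SGE configuration is investigated, the field names could be cleaned up before the namedtuple is built. This is only a sketch and not something cgatcore provides: `safe_field_name` and `make_benchmark_tuple` are hypothetical helpers, and it assumes the benchmark record is a dict-like object whose keys become the field names, as the `sorted(benchmark_data[0])` call in the traceback suggests. Applying it would mean patching your local copy of the code.

```python
import keyword
import re
from collections import namedtuple


def safe_field_name(name):
    """Turn an arbitrary string into something namedtuple accepts (hypothetical helper)."""
    # Replace every character that is not a letter, digit, or underscore.
    cleaned = re.sub(r"\W", "_", name)
    # Field names may not be empty or start with a digit or an underscore.
    if not cleaned or not cleaned[0].isalpha():
        cleaned = "f_" + cleaned
    # Python keywords are rejected as well.
    if keyword.iskeyword(cleaned):
        cleaned += "_"
    return cleaned


def make_benchmark_tuple(record):
    """Build a BenchmarkData tuple from a dict, tolerating malformed keys."""
    keys = sorted(record)
    fields = [safe_field_name(key) for key in keys]
    # rename=True is a last resort: any name that is still invalid or
    # duplicated is replaced with a positional name such as _0 or _1.
    benchmark_cls = namedtuple("BenchmarkData", fields, rename=True)
    return benchmark_cls(*(record[key] for key in keys))


# Invented record; only the malformed key is taken from the traceback.
record = {"binding_inuse!SCcCCCCCCSCCCCCCCC": "NONE", "wallclock": "12.5"}
result = make_benchmark_tuple(record)
print(result._fields)
```

This keeps the values intact and only alters the offending names, so downstream code that reads the well-formed fields should be unaffected. It hides the symptom rather than fixing whatever produces the malformed key.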