caracal-pipeline/caracal

Error at the end of A&P calibration of selfcal (and at the beginning of DDcal)

Closed this issue · 7 comments

Hi,

I am performing amplitude and phase self-calibration, and all goes well until the very end, where the log reports this error (followed by a few more):

# INFO:    Converting SIF file to temporary sandbox...
# INFO:    Cleaning up image...
#   File "/stimela_mount/code/run.py", line 62, in <module>
#     subprocess.check_call(shlex.split(_runc))
#   File "/usr/lib/python2.7/subprocess.py", line 190, in check_call
#     raise CalledProcessError(retcode, cmd)
# subprocess.CalledProcessError: Command '['tigger-convert', '--append', '/stimela_mount/output/continuum/image_2/A1300_selfcal_A1300_2-pybdsm.lsm.html', '--force', '--append-type', 'auto', '--rename', '/stimela_mount/output/continuum/image_1/A1300_selfcal_A1300_1-pybdsm.lsm.html', '/stimela_mount/output/continuum/image_2/A1300_selfcal_A1300_final-pybdsm.lsm.html']' returned non-zero exit status 1
2023-07-13 00:00:37 CARACal.Stimela.create-final_lsm-1-2 ERROR: cd /local/work/lofaruser4/A1300/.stimela_workdir-16890584258721247 && singularity run --userns --workdir /local/work/lofaruser4/A1300/.stimela_workdir-16890584258721247 --containall returns error code 1

The final lines report:

2023-07-13 00:00:37 CARACal ERROR: stimela.exceptions.PipelineException: Job 'create-final_lsm-1-2:: Combined models' failed: cd /local/work/lofaruser4/A1300/.stimela_workdir-16890584258721247 && singularity run --userns --workdir /local/work/lofaruser4/A1300/.stimela_workdir-16890584258721247 --containall returns error code 1
2023-07-13 00:00:37 CARACal INFO: exiting with error code 1

However, all the images and the .ms files are produced correctly. I had also run a few cycles of phase-only selfcal beforehand, and they all completed without errors. A very similar error occurs in the early steps when running ddcal on the data produced by these selfcal cycles.

I have also attached the full log file.

Thanks,
Marco
log-caracal.txt

Hi @MarcoBalboni, sorry for the delayed response.

The problem is happening here:

restore_model:
  enable:                       True
  model:                        1+2
  clean_model:                  3

The currently selected calibration mode is vis_only.
If you need to combine models using this option, the mode must be pybdsm_only or pybdsm_vis.
The idea is that this enables source finding in your image1 and image2 (image2 being the one you'll be cleaning deeper) and combines the resulting models.

In the current case, image2 (A1300_selfcal_A1300_2) is the result of the amplitude and phase self-calibration that you performed.
So just disable restore_model and it should complete successfully.
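
As a rough sketch of the two options described above, the relevant part of the selfcal worker configuration might look like the fragment below. It assumes that cal_model_mode sits at the top level of the selfcal section, next to restore_model; the exact nesting and the remaining keys should be checked against your own config file and the CARACal documentation.

```yaml
selfcal:
  enable: True
  # Option 1: stay in vis_only mode and skip the model combination step.
  cal_model_mode: vis_only
  restore_model:
    enable: False

  # Option 2 (alternative, shown commented out): keep the model combination,
  # but switch to a mode that runs source finding on the images.
  # cal_model_mode: pybdsm_vis      # or pybdsm_only
  # restore_model:
  #   enable: True
  #   model: 1+2
  #   clean_model: 3
```

Only one of the two options should be active at a time; mixing vis_only with an enabled restore_model is the combination that fails here.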

Hope this helps

Hi @Athanaseus, thank you for the suggestion. Unfortunately, if I disable restore_model, another error occurs, this time involving SoFiA:

stimela.exceptions.PipelineException: Job 'make-sofia_mask-field0-iter0:: Make SoFiA mask' failed: cd /local/work/lofaruser4/A1300/.stimela_workdir-16902486927050576 && singularity returns error code 1
2023-07-25 07:28:27 CARACal INFO: exiting with error code 1

What I am trying now is to re-run the whole thing with cal_model_mode set to 'pybdsm_vis'.

Do you think it will be ok?

Thank you.

(The log file is attached below.)

log-caracal.txt

Looking at the previous logs, it appears the run was still at the image clean-up step when the error occurred.
Please check which SoFiA products are available (or the SoFiA-specific log in the logs directory).
Also, does the error persist when you re-run?

# 
# --- SoFiA 1.3.2: Removing unreliable sources ---------------------------------
#     Elapsed time: 00:04:25.99 h
# 
# Reloading data cube for parameterisation
# Loading cube /stimela_mount/input/continuum/image_0/A1300_selfcal_A1300_0-MFS-image.fits
# The data cube has been loaded.
# 
# --- SoFiA 1.3.2: Writing mask cube -------------------------------------------
#     Elapsed time: 00:04:26.65 h
# 
# 
# --- SoFiA 1.3.2: Adding WCS position to catalogue ----------------------------
#     Elapsed time: 00:04:26.90 h
# 
# WCS coordinates added to catalogue.
# 
# --- SoFiA 1.3.2: Writing output catalogue ------------------------------------
#     Elapsed time: 00:04:29.52 h
# 
# 
# --- SoFiA 1.3.2: Pipeline finished -------------------------------------------
#     Elapsed time: 00:04:29.52 h
# 
# INFO:    Cleaning up image...

The log also indicates that you are still using vis_only mode.
Yes, you can also try running with pybdsm_vis.
Check the documentation here: https://caracal.readthedocs.io/en/latest/manual/workers/selfcal/index.html#cal-model-mode

Yes, when I re-ran with restore_model disabled, I left vis_only on purpose. However, even when I run it with pybdsm_vis, the latter error still occurs.
Where can I find the SoFiA products?
The SoFiA-specific log is attached below:

log-selfcal__ap-make-sofia_mask-field0-iter0-20230725-112932.txt

Thank you for your help.

Marco

Hi @MarcoBalboni,
The error is ambiguous and does not come directly from SoFiA, and I'm having trouble reproducing it on my end.
Can you update stimela and see if that helps? (pip install stimela==1.7.6)
The SoFiA products (in this case, a mask) should be in the output/masking directory.

Hi,
The new error was probably caused by Singularity being badly initialized, or something similar. I am now trying to run the whole thing with the initial fix you provided (restore_model disabled).
I will keep you updated.

Thank you.

Marco

Hi @Athanaseus, disabling restore_model worked: the selfcal finished without problems, and the ddcal also seems to work properly.
Thank you again for your help.

Marco