Unexpected thread state when solving des
Opened this issue · 5 comments
bennahugo commented
376.37 25.0Gb gainopts(StefCal.py:905:get_result): checking flagging
376.40 25.0Gb gainopts(StefCal.py:918:get_result): 0.00% (0/10654800) data points were flagged in the stefcal process. Can take.
376.40 25.0Gb gainopts(StefCal.py:1009:get_result): computing result
378.32 25.7Gb gainopts(StefCal.py:1081:get_result): computing result: done
378.32 25.7Gb gainopts(StefCal.py:1117:get_result): ev.0.0.0.2.1 elapsed time 0m4.25s
terminate called after throwing an instance of 'LOFAR::Exception'
what(): unexpected thread state in getWorkOrder()
Traceback (most recent call last):
File "/usr/bin/meqtree-pipeliner.py", line 176, in <module>
res = func(mqs,None,wait=True);
File "/usr/lib/python2.7/dist-packages/Cattery/Calico/calico-stefcal.py", line 381, in _run_stefcal
mqs.execute('VisDataMux',mssel.create_io_request(),wait=wait);
File "/usr/lib/python2.7/dist-packages/Timba/Apps/meqserver.py", line 173, in execute
return self.meq('Node.Execute',rec,wait=wait);
File "/usr/lib/python2.7/dist-packages/Timba/Apps/meqserver.py", line 126, in meq
msg = self.await(replyname,resume=True,timeout=wait);
File "/usr/lib/python2.7/dist-packages/Timba/Apps/multiapp_proxy.py", line 524, in await
raise RuntimeError,"lost all connections while waiting for event "+str(what);
RuntimeError: lost all connections while waiting for event Result.Node.execute.1
/home/hugo/output/COMBINED.J1638.2-6420.1GC-J1638.2-6420.diffgain.cp does not exist, so not trying to remove
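When comparing runs like this, it can help to pull the structured fields out of the log to see which step was reached last and how memory evolved before the crash. The following is only a sketch: the regular expression is inferred from the log lines quoted above, and the helper names `parse_log_line` and `last_step_before_crash` are hypothetical, not part of MeqTrees or Calico.

```python
import re

# Pattern inferred from the quoted log lines (an assumption, not a documented format):
# "<elapsed s> <memory>Gb <module>(<file>:<line>:<function>): <message>"
LOG_LINE = re.compile(
    r"^(?P<elapsed>\d+\.\d+)\s+"
    r"(?P<mem_gb>\d+(?:\.\d+)?)Gb\s+"
    r"(?P<module>\w+)\((?P<file>[^:]+):(?P<line>\d+):(?P<func>\w+)\):\s*"
    r"(?P<message>.*)$"
)


def parse_log_line(text):
    """Return a dict of fields for a matching log line, else None."""
    m = LOG_LINE.match(text.strip())
    if m is None:
        return None
    rec = m.groupdict()
    # Convert the numeric fields so they can be compared or plotted
    rec["elapsed"] = float(rec["elapsed"])
    rec["mem_gb"] = float(rec["mem_gb"])
    rec["line"] = int(rec["line"])
    return rec


def last_step_before_crash(lines):
    """Return the last parseable log record, i.e. the last step logged."""
    last = None
    for text in lines:
        rec = parse_log_line(text)
        if rec is not None:
            last = rec
    return last
```

Lines that do not match the pattern (the `terminate called` message, the traceback) are skipped, so feeding the whole output to `last_step_before_crash` returns the last step that was logged before the process died.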
Parset options:
def decalibrate(incol="SUBTRACTED_DATA",
                calincol="CORRECTED_DATA",
                outcol="SUBTRACTED_DATA",
                model="MODEL_WITHOUT_DES",
                lsmfilepostfix="decal1",
                des="{0:s}-catalog.lsm.html.de_tagged.lsm.html",
                label='decal',
                freq_int=[16, 64],
                masksig=[45, 45, 45],
                solvemode='Gain2x2',
                corrtype='CORR_DATA_SUB',  # 'sr',
                interval=[40, 80, 80],
                restore=None):
    ....
    recipe.add("cab/calibrator", "calibrate_target_%d" % ti, {
        'msname': "%s.%s.1GC.ms" % (PREFIX, t),
        'column': calincol,
        'tile-size': 120,
        'make-plots': True,
        'skymodel': "{0:s}:output".format(des).format(f),
        ##'model-column': 'MODEL_WITHOUT_DES',
        'Ejones': True,
        'beam-files-pattern': "MeerKAT_VBeam_10MHz_53Chans_$(xy)_$(reim).fits",
        'beam-l-axis': "X",
        'beam-m-axis': "Y",
        'parallactic-angle-rotation': True,
        'write-flagset': "cubical",
        'read-legacy-flags': True,
        'fill-legacy-flags': False,
        'save-config': "{0:s}.tdl".format(t),
        'label': t,
        'prefix': t,
        'make-plots': True,
        'output-data': corrtype,
        'output-column': outcol,
        'DDjones': True,
        'DDjones-tag': 'dE',
        'DDjones-solution-intervals': [interval[ti], freq_int[ti]],
        'DDjones-smoothing-intervals': [interval[ti] * 5, freq_int[ti] * 5],
        'DDjones-matrix-type': solvemode,
        'DDjones-niter': 1000,
        'DDjones-chisq-clipping': True,
        'threads': 64,
        'DDjones-ampl-clipping': True,
        'DDjones-ampl-clipping-high': 1.2,
        'DDjones-ampl-clipping-low': 0,
        'DDjones-niter': 1000,
        'save-config': "{0:s}.tdl".format(t)
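Side note on the parset: the keys 'make-plots', 'DDjones-niter' and 'save-config' each appear twice in the dict literal. Python silently keeps the last value for a repeated key, so this is harmless here because the repeated values are identical, but it can hide a stale setting if they ever diverge. A minimal sketch for spotting such repeats with the standard `ast` module follows; the helper name `duplicate_dict_keys` and the placeholder values in `snippet` are made up for illustration.

```python
import ast
from collections import Counter


def duplicate_dict_keys(source):
    """Return {key: count} for constant keys repeated in any dict literal."""
    dupes = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Dict):
            # Only literal keys (strings, numbers) can be compared statically;
            # computed keys and ** unpacking are skipped.
            keys = [k.value for k in node.keys if isinstance(k, ast.Constant)]
            for key, count in Counter(keys).items():
                if count > 1:
                    dupes[key] = count
    return dupes


# Cut-down stand-in for the parset above; the values are placeholders.
snippet = """
opts = {
    'make-plots': True,
    'DDjones-niter': 1000,
    'save-config': 'a.tdl',
    'make-plots': True,
    'DDjones-niter': 1000,
    'save-config': 'a.tdl',
}
"""
print(duplicate_dict_keys(snippet))
```

Running this over the recipe file itself (read with `open(...).read()`) would list every repeated literal key without executing the recipe.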
bennahugo commented
@o-smirnov any ideas?
bennahugo commented
I think some tiles are already fully flagged
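A quick way to test that hypothesis, sketched under assumptions: given a boolean flag array with time on the first axis (how it is read from the MS is left out here) and the 'tile-size' from the parset, list the time tiles in which every sample is flagged. The function name `fully_flagged_tiles` and the toy data are hypothetical.

```python
import numpy as np


def fully_flagged_tiles(flags, tile_size):
    """Return indices of time tiles in which every sample is flagged.

    flags: boolean array with time on axis 0 (True = flagged).
    tile_size: number of timeslots per tile.
    """
    flags = np.asarray(flags, dtype=bool)
    ntime = flags.shape[0]
    bad = []
    for tile, start in enumerate(range(0, ntime, tile_size)):
        # The last tile may be shorter than tile_size; slicing handles that
        if flags[start:start + tile_size].all():
            bad.append(tile)
    return bad


# Toy data: 6 timeslots x 4 channels, timeslots 2 and 3 entirely flagged
flags = np.zeros((6, 4), dtype=bool)
flags[2:4] = True
print(fully_flagged_tiles(flags, tile_size=2))  # [1]
print(fully_flagged_tiles(flags, tile_size=3))  # []
```

With the toy data, a tile size of 2 isolates the flagged timeslots in their own tile, while a tile size of 3 mixes them with unflagged data, so no tile is fully flagged.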
o-smirnov commented
Google just shows a bunch of shoes. Probably means we're hosed?
[image: image.png]
On Thu, Apr 11, 2019 at 11:59 AM Benjamin Hugo wrote:
> @o-smirnov any ideas?
o-smirnov commented
Yeah I'm not sure it handles that gracefully. Maybe rerun with bigger tiles?
On Thu, Apr 11, 2019 at 12:02 PM Benjamin Hugo wrote:
> I think some tiles are already fully flagged
bennahugo commented
Hmm, no, it looks like something more insidious:
Running in serial, it works...
956.77 6.1Gb gainopts(GainOpts.py:295:resolve_tilings): based on an LCM tiling of [31, 1366]
956.77 6.1Gb gainopts(GainOpts.py:287:resolve_tilings): datashape (31, 1366) expanded datashape is (31, 1366)
956.77 6.1Gb gainopts(GainOpts.py:295:resolve_tilings): based on an LCM tiling of [31, 1366]
956.78 6.1Gb gainopts(GainOpts.py:287:resolve_tilings): datashape (31, 1366) expanded datashape is (31, 1366)
956.78 6.1Gb gainopts(GainOpts.py:295:resolve_tilings): based on an LCM tiling of [31, 1366]
956.78 6.1Gb gainopts(GainOpts.py:287:resolve_tilings): datashape (31, 1366) expanded datashape is (31, 1366)
956.78 6.1Gb gainopts(GainOpts.py:295:resolve_tilings): based on an LCM tiling of [31, 1366]
956.79 6.1Gb gainopts(GainOpts.py:287:resolve_tilings): datashape (31, 1366) expanded datashape is (31, 1366)
956.79 6.1Gb gainopts(GainOpts.py:295:resolve_tilings): based on an LCM tiling of [31, 1366]
956.80 6.1Gb gainopts(GainOpts.py:287:resolve_tilings): datashape (31, 1366) expanded datashape is (31, 1366)
956.80 6.1Gb gainopts(GainOpts.py:295:resolve_tilings): based on an LCM tiling of [31, 1366]
956.80 6.1Gb gainopts(GainOpts.py:287:resolve_tilings): datashape (31, 1366) expanded datashape is (31, 1366)
956.80 6.1Gb gainopts(GainOpts.py:295:resolve_tilings): based on an LCM tiling of [31, 1366]
956.81 6.1Gb gainopts(GainOpts.py:287:resolve_tilings): datashape (31, 1366) expanded datashape is (31, 1366)
956.81 6.1Gb gainopts(GainOpts.py:295:resolve_tilings): based on an LCM tiling of [31, 1366]
956.82 6.1Gb gainopts(StefCal.py:484:get_result): constructed internal arrays, trying to release array memory
956.84 6.1Gb gainopts(StefCal.py:487:get_result): released memory
956.84 6.1Gb gainopts(StefCal.py:492:get_result): no valid data found for solvable IFRs -- nothing to stefcal!
/home/hugo/output/COMBINED.J1638.2-6420.1GC-J1638.2-6420.diffgain.cp does not exist, so not trying to remove
### Job result: None
### No more commands
### Stopping the meqserver
### All your batch are belong to us. Bye!