LSSTDESC/dia_pipe

Problem running imageDifference over multiple tracts

Closed this issue · 3 comments

Bruno found that some images were failing the imageDifference step when running over multiple tracts in DC2. Here is an example of the error message:

root INFO: Processing 1 targets with a pool of 4 processes...
imageDifferenceDriver INFO: nid00533:39861: Start processing {'visit': 400438, 'detector': 28, 'filter': 'g', 'raftName': 'R10', 'detectorName': 'S01'}
imageDifferenceDriver.imageDifference INFO: Processing {'visit': 400438, 'detector': 28, 'filter': 'g', 'raftName': 'R10', 'detectorName': 'S01'}
imageDifferenceDriver.imageDifference.getTemplate INFO: Central skyMap tract 4227
imageDifferenceDriver.imageDifference.getTemplate INFO: All overlapping skyMap tracts [4227, 4432]
imageDifferenceDriver.imageDifference.getTemplate INFO: Adding patch (1, 5) from tract 4227
imageDifferenceDriver.imageDifference.getTemplate INFO: Adding patch (1, 6) from tract 4227
imageDifferenceDriver.imageDifference.getTemplate INFO: Adding patch (2, 5) from tract 4227
imageDifferenceDriver.imageDifference.getTemplate INFO: Adding patch (2, 6) from tract 4227
imageDifferenceDriver.imageDifference.getTemplate INFO: Adding patch (4, 0) from tract 4432
imageDifferenceDriver.imageDifference.getTemplate INFO: Adding patch (5, 0) from tract 4432
imageDifferenceDriver.imageDifference.getTemplate INFO: Adding patch (6, 0) from tract 4432
imageDifferenceDriver WARN: Failed imageDifferenceDriver: 
  File "src/image/Image.cc", line 172, in void lsst::afw::image::ImageBase<PixelT>::assign(const lsst::afw::image::ImageBase<PixelT>&, const lsst::geom::Box2I&, lsst::afw::image::ImageOrigin) [with PixelT = float]
    Dimension mismatch: 5702x5705 v. 0x0 {0}
lsst::pex::exceptions::LengthError: 'Dimension mismatch: 5702x5705 v. 0x0'

imageDifferenceDriver INFO: nid00533:39861: Finished processing {'visit': 400438, 'detector': 28, 'filter': 'g', 'raftName': 'R10', 'detectorName': 'S01'}
imageDifferenceDriver WARN: Could not persist metadata for dataId={'visit': 400438, 'detector': 28, 'filter': 'g', 'raftName': 'R10', 'detectorName': 'S01'}: Cannot look up skymap key 'tract'; it must be explicitly included in the data ID
39862 INFO  2021-02-07T16:41:44.841-0800 root: Process stats for nid00533:39862: {'Name': 'python', 'Umask': '0002', 'State': 'R (running)', 'Tgid': '39862', 'Ngid': '0', 'Pid': '39862', 'PPid': '39859', 'TracerPid': '0', 'Uid': '80716\t80716\t80716\t80716', 'Gid': '80716\t80716\t80716\t80716', 'FDSize': '256', 'Groups': '56799 57177 80716', 'NStgid': '39862', 'NSpid': '39862', 'NSpgid': '39862', 'NSsid': '39862', 'VmPeak': '2127432 kB', 'VmSize': '2127432 kB', 'VmLck': '0 kB', 'VmPin': '0 kB', 'VmHWM': '377764 kB', 'VmRSS': '377764 kB', 'RssAnon': '205732 kB', 'RssFile': '169140 kB', 'RssShmem': '2892 kB', 'VmData': '260636 kB', 'VmStk': '212 kB', 'VmExe': '2284 kB', 'VmLib': '971012 kB', 'VmPTE': '3828 kB', 'VmPMD': '20 kB', 'VmSwap': '0 kB', 'HugetlbPages': '0 kB', 'HugetlbResvPages': '0 kB', 'Threads': '1', 'SigQ': '0/513037', 'SigPnd': '0000000000000000', 'ShdPnd': '0000000000000000', 'SigBlk': '0000000000000000', 'SigIgn': '0000000001101000', 'SigCgt': '00000001800006aa', 'CapInh': '0000000000000000', 'CapPrm': '0000000000000000', 'CapEff': '0000000000000000', 'CapBnd': '0000003fffffffff', 'CapAmb': '0000000000000000', 'NoNewPrivs': '0', 'Seccomp': '0', 'Speculation_Store_Bypass': 'thread vulnerable', 'Cpus_allowed': '0a000000,0a000000', 'Cpus_allowed_list': '25,27,57,59', 'Mems_allowed': '00000000,00000003', 'Mems_allowed_list': '0-1', 'voluntary_ctxt_switches': '77664', 'nonvoluntary_ctxt_switches': '11130'}
39863 INFO  2021-02-07T16:41:44.841-0800 root: Process stats for nid00533:39863: {'Name': 'python', 'Umask': '0002', 'State': 'R (running)', 'Tgid': '39863', 'Ngid': '0', 'Pid': '39863', 'PPid': '39859', 'TracerPid': '0', 'Uid': '80716\t80716\t80716\t80716', 'Gid': '80716\t80716\t80716\t80716', 'FDSize': '256', 'Groups': '56799 57177 80716', 'NStgid': '39863', 'NSpid': '39863', 'NSpgid': '39863', 'NSsid': '39863', 'VmPeak': '2127436 kB', 'VmSize': '2127436 kB', 'VmLck': '0 kB', 'VmPin': '0 kB', 'VmHWM': '377552 kB', 'VmRSS': '377552 kB', 'RssAnon': '205712 kB', 'RssFile': '168952 kB', 'RssShmem': '2888 kB', 'VmData': '260640 kB', 'VmStk': '212 kB', 'VmExe': '2284 kB', 'VmLib': '971012 kB', 'VmPTE': '3824 kB', 'VmPMD': '20 kB', 'VmSwap': '0 kB', 'HugetlbPages': '0 kB', 'HugetlbResvPages': '0 kB', 'Threads': '1', 'SigQ': '0/513037', 'SigPnd': '0000000000000000', 'ShdPnd': '0000000000000000', 'SigBlk': '0000000000000000', 'SigIgn': '0000000001101000', 'SigCgt': '00000001800006aa', 'CapInh': '0000000000000000', 'CapPrm': '0000000000000000', 'CapEff': '0000000000000000', 'CapBnd': '0000003fffffffff', 'CapAmb': '0000000000000000', 'NoNewPrivs': '0', 'Seccomp': '0', 'Speculation_Store_Bypass': 'thread vulnerable', 'Cpus_allowed': '0a000000,0a000000', 'Cpus_allowed_list': '25,27,57,59', 'Mems_allowed': '00000000,00000003', 'Mems_allowed_list': '0-1', 'voluntary_ctxt_switches': '80252', 'nonvoluntary_ctxt_switches': '1073'}
39864 INFO  2021-02-07T16:41:44.841-0800 root: Process stats for nid00533:39864: {'Name': 'python', 'Umask': '0002', 'State': 'R (running)', 'Tgid': '39864', 'Ngid': '0', 'Pid': '39864', 'PPid': '39859', 'TracerPid': '0', 'Uid': '80716\t80716\t80716\t80716', 'Gid': '80716\t80716\t80716\t80716', 'FDSize': '256', 'Groups': '56799 57177 80716', 'NStgid': '39864', 'NSpid': '39864', 'NSpgid': '39864', 'NSsid': '39864', 'VmPeak': '2127432 kB', 'VmSize': '2127432 kB', 'VmLck': '0 kB', 'VmPin': '0 kB', 'VmHWM': '377768 kB', 'VmRSS': '377768 kB', 'RssAnon': '205708 kB', 'RssFile': '169140 kB', 'RssShmem': '2920 kB', 'VmData': '260636 kB', 'VmStk': '212 kB', 'VmExe': '2284 kB', 'VmLib': '971012 kB', 'VmPTE': '3824 kB', 'VmPMD': '20 kB', 'VmSwap': '0 kB', 'HugetlbPages': '0 kB', 'HugetlbResvPages': '0 kB', 'Threads': '1', 'SigQ': '0/513037', 'SigPnd': '0000000000000000', 'ShdPnd': '0000000000000000', 'SigBlk': '0000000000000000', 'SigIgn': '0000000001101000', 'SigCgt': '00000001800006aa', 'CapInh': '0000000000000000', 'CapPrm': '0000000000000000', 'CapEff': '0000000000000000', 'CapBnd': '0000003fffffffff', 'CapAmb': '0000000000000000', 'NoNewPrivs': '0', 'Seccomp': '0', 'Speculation_Store_Bypass': 'thread vulnerable', 'Cpus_allowed': '0a000000,0a000000', 'Cpus_allowed_list': '25,27,57,59', 'Mems_allowed': '00000000,00000003', 'Mems_allowed_list': '0-1', 'voluntary_ctxt_switches': '81654', 'nonvoluntary_ctxt_switches': '2286'}
root INFO: Process stats for nid00533:39861: {'Name': 'python', 'Umask': '0002', 'State': 'R (running)', 'Tgid': '39861', 'Ngid': '0', 'Pid': '39861', 'PPid': '39859', 'TracerPid': '0', 'Uid': '80716\t80716\t80716\t80716', 'Gid': '80716\t80716\t80716\t80716', 'FDSize': '256', 'Groups': '56799 57177 80716', 'NStgid': '39861', 'NSpid': '39861', 'NSpgid': '39861', 'NSsid': '39861', 'VmPeak': '5653784 kB', 'VmSize': '2794388 kB', 'VmLck': '0 kB', 'VmPin': '0 kB', 'VmHWM': '3821416 kB', 'VmRSS': '962312 kB', 'RssAnon': '777976 kB', 'RssFile': '180656 kB', 'RssShmem': '3680 kB', 'VmData': '866984 kB', 'VmStk': '212 kB', 'VmExe': '2284 kB', 'VmLib': '1000184 kB', 'VmPTE': '5104 kB', 'VmPMD': '20 kB', 'VmSwap': '0 kB', 'HugetlbPages': '0 kB', 'HugetlbResvPages': '0 kB', 'Threads': '1', 'SigQ': '0/513037', 'SigPnd': '0000000000000000', 'ShdPnd': '0000000000000000', 'SigBlk': '0000000000000000', 'SigIgn': '0000000001101000', 'SigCgt': '00000001800006aa', 'CapInh': '0000000000000000', 'CapPrm': '0000000000000000', 'CapEff': '0000000000000000', 'CapBnd': '0000003fffffffff', 'CapAmb': '0000000000000000', 'NoNewPrivs': '0', 'Seccomp': '0', 'Speculation_Store_Bypass': 'thread vulnerable', 'Cpus_allowed': '0a000000,0a000000', 'Cpus_allowed_list': '25,27,57,59', 'Mems_allowed': '00000000,00000003', 'Mems_allowed_list': '0-1', 'voluntary_ctxt_switches': '278926', 'nonvoluntary_ctxt_switches': '455'}
Sun Feb  7 16:41:46 PST 2021
Done.

The problem here is that the initial guess of which patches overlap is only approximate. When the code later does a more precise calculation (including inner bounding boxes), a patch can turn out to have no overlap at all, and the code then tries to create an image of size zero. That would explain the 0x0 in the dimension mismatch above.
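To make the failure mode concrete, here is a minimal sketch of the kind of guard that would avoid it. It is not the actual dia_pipe or LSST stack code: `Box`, `select_overlapping_patches`, and the patch labels are hypothetical stand-ins written in plain Python so the example runs without the stack. The idea is only that each candidate patch's inner bounding box is clipped to the template region, and patches whose clipped box is empty are skipped rather than turned into a zero-size image.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Box:
    """Hypothetical integer pixel box, half-open: [x0, x1) x [y0, y1)."""
    x0: int
    y0: int
    x1: int
    y1: int

    @property
    def width(self) -> int:
        # A box whose bounds have crossed is treated as zero-width.
        return max(0, self.x1 - self.x0)

    @property
    def height(self) -> int:
        return max(0, self.y1 - self.y0)

    def is_empty(self) -> bool:
        return self.width == 0 or self.height == 0

    def clipped_to(self, other: "Box") -> "Box":
        """Return the intersection of this box with another (may be empty)."""
        return Box(
            max(self.x0, other.x0),
            max(self.y0, other.y0),
            min(self.x1, other.x1),
            min(self.y1, other.y1),
        )


def select_overlapping_patches(
    template_box: Box,
    candidates: Iterable[Tuple[str, Box]],
) -> List[Tuple[str, Box]]:
    """Keep only candidates whose inner box really overlaps the template box.

    `candidates` is the approximate first-guess list of (label, inner box).
    """
    kept = []
    for label, inner_box in candidates:
        overlap = inner_box.clipped_to(template_box)
        if overlap.is_empty():
            # The approximate guess included this patch, but the precise
            # check finds no overlap: skip it instead of building a 0x0 image.
            continue
        kept.append((label, overlap))
    return kept


# Illustrative values only; these are not taken from the log above.
template_box = Box(0, 0, 100, 100)
candidates = [
    ("patch A", Box(-50, -50, 60, 60)),    # genuinely overlaps
    ("patch B", Box(100, 0, 200, 100)),    # only touches the edge
    ("patch C", Box(150, 150, 200, 200)),  # entirely outside
]
kept = select_overlapping_patches(template_box, candidates)
```

With these made-up boxes only "patch A" survives the precise check; the other two are dropped before any image would be allocated for them.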

wmwv commented

So is this something to fix in getMultiTractTemplate?

Yes.