lofar-astron/DP3

"Illegal Instruction" error on Intel Xeon CPU E5-2670 with "sandybridge" architecture

Closed this issue · 4 comments

I am trying to move my LOFAR work to a machine in Durham that runs an Intel Xeon CPU E5-2670 with a "sandybridge" architecture. I was formerly using the University of Hertfordshire machine, which runs Xeons (Gold 6130) with a "haswell" architecture. The singularity images that I was using were built with MARCH='x86-64' and MTUNE='generic', which should allow the image to work across most machines. However, when running a simple NDPPP command (see below) I get an "Illegal instruction" error on the Durham machine, while the same image and command work fine at Hertfordshire.

Input Instruction

Singularity> NDPPP msin=L644205_SB000_uv.MS/ msout=L644205_SB000_uv_copy.MS/ steps=[]
Illegal instruction
Singularity> 

However, running NDPPP just to print the version works and returns the version specified in the singularity image build instructions. Other LOFAR-specific software such as losoto also works fine, as do other tools like msoverview. I would note that I think these other tools are largely written in Python.

NDPPP -v returns 5.3.1 as expected.

With this in mind, I tried building singularity images specific to the Durham architecture (i.e. with MARCH and MTUNE set to 'sandybridge'). I also tried images with NOAVX512 set to true and to false. However, with all of these images I still get the same "Illegal instruction" error.

Singularity build files - https://github.com/tikk3r/lofar-grid-hpccloud/tree/fedora - Build the base image and then the final image with MARCH, MTUNE and NOAVX512 consistent.

I am not sure whether this is something that needs a fix in NDPPP or a problem with how I'm building the singularity images for the Durham CPUs, but I thought I would open this issue to ask for advice. I really don't know what to do with such a terse error message, so hopefully someone has an idea. Let me know if you need anything else in this issue description.

Hi @kwpetley. Are you also overriding the default CMake settings of DP3? By default, DP3's CMake configuration adds -march=native to the compiler flags, so unless you disable that, DP3 is compiled (only) for the machine it is built on. The normal way to disable this is to add -DPORTABLE=True to the cmake command, which removes that flag. Did you try that?
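One way to see whether the build machine and the Durham machine differ in the way that matters here is to compare the SIMD extensions each CPU advertises. Sandy Bridge CPUs support AVX but not AVX2 or FMA, so a binary compiled with -march=native on a newer build host can hit an illegal instruction there. Below is a minimal sketch, assuming a Linux host that exposes /proc/cpuinfo; the function name simd_flags is hypothetical and not part of DP3.

```shell
#!/bin/sh
# Report which SIMD extensions a CPU advertises, based on a cpuinfo-style file.
# Usage: simd_flags [FILE]   (FILE defaults to /proc/cpuinfo on Linux)
# NOTE: simd_flags is an illustrative helper, not a DP3 or LOFAR tool.
simd_flags() {
    cpuinfo="${1:-/proc/cpuinfo}"
    # Take the first "flags" line and split it into one flag per line.
    flags=$(awk -F: '/^flags/ {print $2; exit}' "$cpuinfo" | tr ' ' '\n')
    for ext in avx avx2 fma avx512f; do
        # -x matches the whole line, so "avx" does not match "avx2".
        if printf '%s\n' "$flags" | grep -qx "$ext"; then
            echo "$ext: yes"
        else
            echo "$ext: no"
        fi
    done
}

# Run on both the build host and the failing host, then compare the output.
if [ -r /proc/cpuinfo ]; then
    simd_flags
fi
```

If the build host reports "yes" for an extension that the run host reports as "no", a -march=native build is a likely cause of the error.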

Hi @aroffringa. Someone else has messaged me about this, so I am currently rebuilding the image with -march='sandybridge' in the cmake command. I will try that first, and then try the -DPORTABLE option if that doesn't work.

Hi - Adding -DTARGET_CPU=$MARCH to the cmake command within the singularity image fixed the issue! So I'm happy to close this issue here and make that change in the singularity recipe repository.
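For anyone applying the same change, the configure step might look roughly like the sketch below. Only the -DTARGET_CPU option comes from this thread; the function name dp3_configure, the DRY_RUN switch and the source path are placeholders for illustration, and the real recipe will pass additional options.

```shell
#!/bin/sh
# Configure DP3 for an explicit target CPU instead of the build host's CPU.
# $1 = target architecture (e.g. the recipe's MARCH value), $2 = DP3 source dir.
# Set DRY_RUN to a non-empty value to print the command instead of running it.
# NOTE: dp3_configure and DRY_RUN are illustrative, not part of DP3 or the recipe.
dp3_configure() {
    ${DRY_RUN:+echo} cmake -DTARGET_CPU="$1" "$2"
}

# Preview the command without running it (the source path is a placeholder).
DRY_RUN=1 dp3_configure sandybridge /path/to/DP3
```

The alternative suggested above is to pass -DPORTABLE=True instead, which drops -march=native rather than naming a specific target.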

Ok, thanks for the update!