MoneroOcean/xmrig

Segmentation faults on 2 ARM devices on Algo switch

Koesters opened this issue · 6 comments

I have 4 ARM systems: 1 Rock64, 2 NanoPC-T4 and 1 Odroid N2+.

On the Odroid and one NanoPC-T4, algo switching seems unstable. The other two work fine. They worked for weeks on vanilla xmrig, and I have been testing them with vanilla again since yesterday with 0 problems. Temperatures are OK. The N2+ is at 30 degrees, actively cooled, on the performance governor. The failing Nano is at 70, passively cooled, on the interactive governor.

I did not edit the config besides User and Rig-ID, then ran the test etc.

It looks to me like this combo is the one that fails:
gcc/9.3.0 LIBS libuv/1.34.2 OpenSSL/1.1.1f hwloc/2.5.0a1-git

As it's ARM, I need to compile it myself.
The 2 that do not work were compiled later.

I also noticed they failed within the same second, even milliseconds apart.
Given that they are different hardware, that makes it seem unlikely that the problem is internal to the devices.
3 ARM systems are on the same switch; one of them, the Rock64, has no problems.

All my Intels (including an Intel Mac) and AMDs don't have issues. I have 13 workers.

NanoPC-T4

  • ABOUT XMRig/6.11.2-mo1 gcc/9.3.0
  • LIBS libuv/1.34.2 OpenSSL/1.1.1f hwloc/2.5.0a1-git
  • HUGE PAGES supported
  • 1GB PAGES unavailable
  • CPU ARM Cortex-A53 (2) 64-bit AES
    L2:0.0 MB L3:0.0 MB 6C/6T NUMA:1
  • MEMORY 3.0/3.7 GB (81%)
  • DONATE 0%
  • POOL #1 gulf.moneroocean.stream:10008 algo auto
  • COMMANDS hashrate, pause, resume, results, connection
  • OPENCL disabled
  • CUDA disabled

snip ...
[2021-04-15 03:32:37.643] cpu use argon2 implementation default
[2021-04-15 03:32:37.834] cpu stopped (191 ms)
[2021-04-15 03:32:37.835] randomx init dataset algo panthera (6 threads) seed 7d60fe93dad74a5e...
[2021-04-15 03:32:38.572] randomx allocated 2336 MB (2080+256) huge pages 100% 1168/1168 +JIT (737 ms)
[2021-04-15 03:32:39.494] randomx dataset ready (923 ms)
[2021-04-15 03:32:39.494] cpu use profile panthera (6 threads) scratchpad 256 KB
[2021-04-15 03:32:39.496] cpu READY threads 6/6 (6) huge pages 100% 6/6 memory 1536 KB (2 ms)
[2021-04-15 03:33:29.923] miner speed 10s/60s/15m 248.6 n/a n/a H/s max 249.7 H/s
[2021-04-15 03:33:40.057] cpu accepted (236/0) diff 20842 (1492 ms)
[2021-04-15 03:34:29.969] miner speed 10s/60s/15m 248.6 248.7 n/a H/s max 249.7 H/s
[2021-04-15 03:35:30.018] miner speed 10s/60s/15m 249.0 248.8 n/a H/s max 249.8 H/s
[2021-04-15 03:35:47.662] net new job from gulf.moneroocean.stream:10008 diff 19421 algo rx/0 height 2339546
[2021-04-15 03:35:47.684] cpu stopped (22 ms)
[2021-04-15 03:35:47.684] randomx init dataset algo rx/0 (6 threads) seed aef2d93d89bcfbe1...
Segmentation fault

ODROID

  • ABOUT XMRig/6.11.2-mo1 gcc/9.3.0
  • LIBS libuv/1.34.2 OpenSSL/1.1.1f hwloc/2.5.0a1-git
  • HUGE PAGES supported
  • 1GB PAGES unavailable
  • CPU ARM Cortex-A53 (2) 64-bit AES
    L2:0.0 MB L3:0.0 MB 6C/6T NUMA:1
  • MEMORY 3.3/3.6 GB (91%)
  • DONATE 0%
  • POOL #1 gulf.moneroocean.stream:10008 algo auto
  • COMMANDS hashrate, pause, resume, results, connection
  • OPENCL disabled
  • CUDA disabled

[2021-04-15 03:32:37.638] cpu use argon2 implementation default
[2021-04-15 03:32:37.749] cpu stopped (111 ms)
[2021-04-15 03:32:37.750] randomx init dataset algo panthera (6 threads) seed 7d60fe93dad74a5e...
[2021-04-15 03:32:38.026] randomx allocated 2336 MB (2080+256) huge pages 100% 1168/1168 +JIT (277 ms)
[2021-04-15 03:32:38.837] randomx dataset ready (810 ms)
[2021-04-15 03:32:38.837] cpu use profile panthera (6 threads) scratchpad 256 KB
[2021-04-15 03:32:38.838] cpu READY threads 6/6 (6) huge pages 100% 6/6 memory 1536 KB (2 ms)
[2021-04-15 03:32:48.403] miner speed 10s/60s/15m n/a n/a n/a H/s max n/a H/s
[2021-04-15 03:32:50.442] cpu accepted (514/0) diff 25084 (1596 ms)
[2021-04-15 03:33:48.450] miner speed 10s/60s/15m 460.2 459.7 n/a H/s max 460.2 H/s
[2021-04-15 03:34:48.483] miner speed 10s/60s/15m 459.4 459.5 n/a H/s max 460.7 H/s
[2021-04-15 03:34:58.378] cpu accepted (515/0) diff 25084 (1026 ms)
[2021-04-15 03:35:00.970] cpu accepted (516/0) diff 25084 (1026 ms)
[2021-04-15 03:35:12.699] cpu accepted (517/0) diff 25084 (1027 ms)
[2021-04-15 03:35:47.644] net new job from gulf.moneroocean.stream:10008 diff 208029 algo rx/arq height 668914
[2021-04-15 03:35:47.660] cpu stopped (16 ms)
[2021-04-15 03:35:47.660] randomx init dataset algo rx/arq (6 threads) seed 128d925cda4fcdaf...
Segmentation fault
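
To put a number on "milliseconds apart", the last line each miner logged before the fault can be compared. This is only a minimal sketch: the two lines are copied from the logs above, the helper name `parse_ts` is made up for illustration, and it assumes the clocks on both boards are synchronized.

```python
from datetime import datetime, timedelta

# Last line logged before "Segmentation fault" on each device (copied from the logs above).
LAST_LINES = {
    "NanoPC-T4": "[2021-04-15 03:35:47.684] randomx init dataset algo rx/0 (6 threads) seed aef2d93d89bcfbe1...",
    "ODROID": "[2021-04-15 03:35:47.660] randomx init dataset algo rx/arq (6 threads) seed 128d925cda4fcdaf...",
}


def parse_ts(line: str) -> datetime:
    """Extract the bracketed timestamp at the start of an xmrig log line."""
    stamp = line[1:line.index("]")]
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")


times = {name: parse_ts(line) for name, line in LAST_LINES.items()}

# Integer division by a 1 ms timedelta gives whole milliseconds without float rounding.
gap_ms = abs(times["NanoPC-T4"] - times["ODROID"]) // timedelta(milliseconds=1)
print(f"last log lines are {gap_ms} ms apart")
```

Both faults directly follow a `randomx init dataset` line, although the two jobs were for different algos (rx/0 on the NanoPC-T4, rx/arq on the ODROID).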

Works on:

NANOPC 2

  • ABOUT XMRig/6.11.2-mo1 gcc/10.1.0
  • LIBS libuv/1.18.0 OpenSSL/1.1.1 hwloc/2.5.0a1-git
  • HUGE PAGES supported
  • 1GB PAGES unavailable
  • CPU ARM Cortex-A53 (2) 64-bit AES
    L2:0.0 MB L3:0.0 MB 6C/6T NUMA:1
  • MEMORY 3.0/3.8 GB (81%)
  • DONATE 0%
  • POOL #1 gulf.moneroocean.stream:10008 algo auto
  • COMMANDS hashrate, pause, resume, results, connection
  • OPENCL disabled
  • CUDA disabled

Rock 64

  • ABOUT XMRig/6.11.2-mo1 gcc/8.4.0
  • LIBS libuv/1.8.0 OpenSSL/1.0.2g hwloc/1.11.2
  • HUGE PAGES supported
  • 1GB PAGES unavailable
  • CPU ARM Cortex-A53 (1) 64-bit AES
    L2:0.0 MB L3:0.0 MB 4C/4T NUMA:1
  • MEMORY 3.2/3.8 GB (83%)
  • DONATE 0%
  • POOL #1 gulf.moneroocean.stream:10002 algo auto
  • COMMANDS hashrate, pause, resume, results, connection
  • OPENCL disabled
  • CUDA disabled

Those are all some super-oddball and untested versions of libs.

Build the versions of deps that are included, using the provided ./scripts/build_deps.sh which should result in:

 * LIBS         libuv/1.41.0 OpenSSL/1.1.1j hwloc/2.4.1

And we can debug from there. Unknown lib versions can't be a basis for any science about the problem. I would currently suspect mostly libuv being too old, or not old enough (since 1.8.x or 1.18.x are on ones that switch okay, but the broken ones are both 1.34.x which could be a buggy notch).
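
As a rough illustration of that comparison, the LIBS lines from the four banners in this issue can be checked against the versions build_deps.sh is expected to produce. This is a sketch, not part of xmrig: the helper names `parse_libs` and `mismatches` are hypothetical, and the banner strings are simply copied from above.

```python
import re

# Versions the bundled scripts/build_deps.sh should produce (quoted above).
EXPECTED = {"libuv": "1.41.0", "OpenSSL": "1.1.1j", "hwloc": "2.4.1"}

# LIBS lines copied from the startup banners in this issue.
BANNERS = {
    "NanoPC-T4": "LIBS libuv/1.34.2 OpenSSL/1.1.1f hwloc/2.5.0a1-git",
    "ODROID": "LIBS libuv/1.34.2 OpenSSL/1.1.1f hwloc/2.5.0a1-git",
    "NANOPC 2": "LIBS libuv/1.18.0 OpenSSL/1.1.1 hwloc/2.5.0a1-git",
    "Rock 64": "LIBS libuv/1.8.0 OpenSSL/1.0.2g hwloc/1.11.2",
}


def parse_libs(line: str) -> dict:
    """Turn 'LIBS name/version ...' into {name: version}."""
    return dict(re.findall(r"(\S+)/(\S+)", line))


def mismatches(line: str) -> dict:
    """Return the libraries whose version differs from EXPECTED."""
    found = parse_libs(line)
    return {name: found.get(name) for name, want in EXPECTED.items()
            if found.get(name) != want}


for device, line in BANNERS.items():
    print(device, mismatches(line) or "matches bundled deps")
```

None of the four builds match the bundled versions, including the two that switch algos without crashing, so a mismatch alone does not single out the failing boards.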

It has now been running stably on both for days. One still gave me trouble until a hwloc and libhwloc-dev (apt) update, deletion of all CMake files, and a recompile.

  • ABOUT XMRig/6.11.2-mo1 gcc/9.3.0
  • LIBS libuv/1.34.2 OpenSSL/1.1.1f hwloc/2.5.0a1-git

It is still showing this. I'll keep your comments and build_deps.sh in mind for any new builds and rebuilds.

It might be due to an xmrig-proxy rebuild, as all ARM systems are behind one proxy on a different Odroid, an XU4.
xss@odroid:~/mine/moneroocean/xmrig-proxy/build# ./xmrig-proxy

  • ABOUT xmrig-proxy/6.12.0-mo1 gcc/9.3.0
  • LIBS libuv/1.41.1-dev OpenSSL/1.0.2g
  • MODE nicehash
  • POOL #1 gulf.moneroocean.stream:11024 algo auto

Strange the proxy is using ancient OpenSSL while the miners are using more current ones.

The XU4 is a 5-year-old 32-bit system with a specially made Linux. As such, the repos are outdated. At best I think I could update to 18.04.
https://wiki.odroid.com/odroid-xu4/os_images/linux/ubuntu_4.14/ubuntu_4.14

I have more such dead hardware, e.g. an Nvidia Jetson TK1, which essentially stops at 14.04. But they can at least do smaller tasks.

The next would be the Rock64 with a special Ubuntu as well.

The most modern, with 20.04 etc., are the NanoPC-T4 and the N2+.

Interesting collection... I have a box full of various old stuff, but mostly MIPS32 router/SBC things, which are even less useful, even if given fresher software builds, as 32-bit is very abandoned now. At least the proxy works at all! :)

It also ran an AIS multiplexer and AIS decoder for years in a room with high temperature variations, because it had to be relatively close to the antennas; it also has various non-SDR, real AIS receivers attached.