StanfordLegion/legion

macOS: pthread_create exits with 35 when running `tutorial/01_tasks_and_futures`

Closed this issue · 5 comments

Hello! When I run `REALM_BACKTRACE=1 ./tasks_and_futures 16` after building with `DEBUG=1 CXXFLAGS="-g -O2" make` in `tutorial/01_tasks_and_futures`, I get the following output:

Error output when running `./tasks_and_futures 16`
[0 - 1d89f9c40]    0.000100 {4}{threads}: reservation ('dedicated worker (generic) #1') cannot be satisfied
Computing the first 16 Fibonacci numbers...
Fibonacci(0) = 0 (elapsed = 0.00 s)
Fibonacci(1) = 1 (elapsed = 0.01 s)
Fibonacci(2) = 1 (elapsed = 0.04 s)
Fibonacci(3) = 2 (elapsed = 0.13 s)
PTHREAD: pthread_create(&thread, &attr, pthread_entry, this) = 35 (Resource temporarily unavailable)
Assertion failed: (0), function start_thread, file threads.cc, line 990.
Signal 6 received by node 0, process 90129 (thread 377a0b000) - obtaining backtrace
Signal 6 received by process 90129 (thread 377a0b000) at: stack trace: 29 frames
  [0] = 0   libsystem_platform.dylib            0x000000018215da23 _sigtramp + 55
  [1] = 0   libsystem_pthread.dylib             0x000000018212dcbf pthread_kill + 287
  [2] = 0   libsystem_c.dylib                   0x0000000182039a3f abort + 179
  [3] = 0   libsystem_c.dylib                   0x0000000182038d2f __assert_rtn + 283
  [4] = 0   tasks_and_futures                   0x0000000100fd2d4f _ZN5Realm12KernelThread12start_threadERKNS_22ThreadLaunchParametersERKNS_15CoreReservationE + 1203
  [5] = 0   tasks_and_futures                   0x0000000100fd3b43 _ZN5Realm6Thread28create_kernel_thread_untypedEPvPFvS1_ERKNS_22ThreadLaunchParametersERNS_15CoreReservationEPNS_15ThreadSchedulerE + 111
  [6] = 0   tasks_and_futures                   0x0000000101004123 _ZN5Realm6Thread20create_kernel_threadINS_21ThreadedTaskSchedulerEXadL_ZNS2_20scheduler_loop_wlockEvEEEEPS0_PT_RKNS_22ThreadLaunchParametersERNS_15CoreReservationEPNS_15ThreadSchedulerE + 55
  [7] = 0   tasks_and_futures                   0x0000000101004097 _ZN5Realm25KernelThreadTaskScheduler13worker_createEb + 67
  [8] = 0   tasks_and_futures                   0x000000010100145b _ZN5Realm21ThreadedTaskScheduler15thread_blockingEPNS_6ThreadE + 1815
  [9] = 0   tasks_and_futures                   0x00000001011ab9f3 _ZN5Realm6Thread18wait_for_conditionINS_23EventTriggeredConditionEEEvRKT_Rb + 211
  [10] = 0   tasks_and_futures                   0x00000001011ab74f _ZNK5Realm5Event15wait_faultawareERb + 743
  [11] = 0   tasks_and_futures                   0x00000001001003c7 _ZNK6Legion8Internal7LgEvent15wait_faultawareERbb + 475
  [12] = 0   tasks_and_futures                   0x00000001000d632b _ZNK6Legion8Internal7ApEvent15wait_faultawareERb + 39
  [13] = 0   tasks_and_futures                   0x000000010074c2c7 _ZN6Legion8Internal10FutureImpl18get_untyped_resultEbPKcbbm + 531
  [14] = 0   tasks_and_futures                   0x00000001000e0c3b _ZNK6Legion6Future18get_untyped_resultEbPKcbm + 207
  [15] = 0   tasks_and_futures                   0x00000001000cf753 _ZNK6Legion6Future13get_referenceIiEERKT_bPKc + 59
  [16] = 0   tasks_and_futures                   0x00000001000cf707 _ZN6Legion19LegionSerialization13StructHandlerIiLb0EE6unpackERKNS_6FutureEbPKc + 51
  [17] = 0   tasks_and_futures                   0x00000001000cf6c7 _ZN6Legion19LegionSerialization6unpackIiEET_RKNS_6FutureEbPKc + 51
  [18] = 0   tasks_and_futures                   0x00000001000c9a37 _ZNK6Legion6Future10get_resultIiEET_bPKc + 51
  [19] = 0   tasks_and_futures                   0x00000001000c9d97 _Z14fibonacci_taskPKN6Legion4TaskERKNSt3__16vectorINS_14PhysicalRegionENS3_9allocatorIS5_EEEEPNS_8Internal11TaskContextEPNS_7RuntimeE + 667
  [20] = 0   tasks_and_futures                   0x00000001000d43ab _ZN6Legion17LegionTaskWrapper19legion_task_wrapperIiXadL_Z14fibonacci_taskPKNS_4TaskERKNSt3__16vectorINS_14PhysicalRegionENS5_9allocatorIS7_EEEEPNS_8Internal11TaskContextEPNS_7RuntimeEEEEEvPKvmSJ_mN5Realm9ProcessorE + 91
  [21] = 0   tasks_and_futures                   0x00000001011e652f _ZN5Realm18LocalTaskProcessor12execute_taskEjRKNS_12ByteArrayRefE + 955
  [22] = 0   tasks_and_futures                   0x0000000100ffd713 _ZN5Realm4Task20execute_on_processorENS_9ProcessorE + 743
  [23] = 0   tasks_and_futures                   0x0000000101003fff _ZN5Realm25KernelThreadTaskScheduler12execute_taskEPNS_4TaskE + 43
  [24] = 0   tasks_and_futures                   0x00000001010028ff _ZN5Realm21ThreadedTaskScheduler14scheduler_loopEv + 1219
  [25] = 0   tasks_and_futures                   0x0000000101003237 _ZN5Realm21ThreadedTaskScheduler20scheduler_loop_wlockEv + 43
  [26] = 0   tasks_and_futures                   0x0000000101019b9b _ZN5Realm6Thread20thread_entry_wrapperINS_21ThreadedTaskSchedulerEXadL_ZNS2_20scheduler_loop_wlockEvEEEEvPv + 99
  [27] = 0   tasks_and_futures                   0x0000000100fd270f _ZN5Realm12KernelThread13pthread_entryEPv + 367
  [28] = 0   libsystem_pthread.dylib             0x000000018212e033 _pthread_start + 135

I'm running macOS 14.3 on an M1 MacBook Pro, using clang 15.0.0 and legion-23.12.0-195-g2832cae66.

I don't get this error when running with any number smaller than 16.

That is an issue with your OS configuration. See the exact error message:

PTHREAD: pthread_create(&thread, &attr, pthread_entry, this) = 35 (Resource temporarily unavailable)

That's your OS telling you that it's not allowing the process to create more threads. Reconfigure your operating system to allow processes to create more threads. Alternatively, check that you don't have too many threads running in other processes, which might cause your OS to limit the number of threads that a single process is allowed to create.
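For readers who want to decode the number in that message: `pthread_create` returns the error number directly rather than setting `errno`, and on macOS the value 35 is `EAGAIN`, which POSIX documents as the system lacking the resources to create another thread or a thread limit being exceeded. The sketch below is only an illustration; the helper names `describe_pthread_create`, `noop`, and `try_one_thread` are made up for this example and are not part of Realm.

```cpp
#include <cerrno>
#include <cstring>
#include <pthread.h>
#include <string>

// Turn a pthread_create return value into a readable message.
// pthread_create returns the error number itself; it does not set errno.
std::string describe_pthread_create(int rc) {
  if (rc == 0) return "ok";
  std::string msg = std::to_string(rc) + " (" + std::strerror(rc) + ")";
  if (rc == EAGAIN) msg += ": EAGAIN, a thread or resource limit was hit";
  return msg;
}

// Thread body that does nothing; used only to probe thread creation.
static void* noop(void*) { return nullptr; }

// Create and join a single thread, returning pthread_create's result.
int try_one_thread() {
  pthread_t t;
  int rc = pthread_create(&t, nullptr, noop, nullptr);
  if (rc == 0) pthread_join(t, nullptr);
  return rc;
}
```

Calling `describe_pthread_create(try_one_thread())` from a small test program reports whether the process can create even one more thread at that moment. The numeric value of `EAGAIN` differs between platforms, so compare against the macro rather than the literal 35.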

I'm not totally convinced that it's the macOS thread limit, since I can compile and successfully run ./index_tasks 100000 in tutorial/02_index_tasks, which uses a lot more threads.

But since ./tasks_and_futures works with an argument of 15 or less and none of the other tutorials seem affected, it's probably related to my system so I'll close the issue.

> I'm not totally convinced that it's the macOS thread limit, since I can compile and successfully run ./index_tasks 100000 in tutorial/02_index_tasks, which uses a lot more threads.

Well, whatever it is, the call to pthread_create is returning a non-zero error code, and that is a function of your particular libc implementation or your OS. I also have an M1, and when I build that example with the same configuration flags I have no trouble running it. I did 100 runs without a single failure. The only difference in my configuration is that I'm running macOS 12.6.7. I strongly advise against using the most recent macOS release; new releases are notoriously buggy.

DEBUG=1 CXXFLAGS="-g -O2" make

By the way, when you build like this, it doesn't actually build with -O2, because we add -O0 after it and the later flag wins:

... -g -O2  -DDARWIN -march=native -O0 ...

You could modify runtime.mk, or just build with DEBUG=0.
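One way to confirm which optimization level actually took effect is to check the `__OPTIMIZE__` macro, which GCC and Clang predefine whenever optimization is enabled. This is a minimal sketch; the function name `optimization_status` is hypothetical and not part of Legion's build system.

```cpp
#include <string>

// Report whether this translation unit was compiled with optimization.
// GCC and Clang define __OPTIMIZE__ for -O1 and above, and leave it
// undefined at -O0, so a trailing -O0 on the command line shows up here.
std::string optimization_status() {
#ifdef __OPTIMIZE__
  return "optimized build";
#else
  return "unoptimized build (-O0)";
#endif
}
```

Printing the result from a file compiled with the same flags as the tutorial shows whether the `-O2` in `CXXFLAGS` survived.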

On my Intel MacBook Pro, running macOS 14.2.1, building with DEBUG=1 and no other settings, I do not see the original reported error. However, I do see:

$ ./tasks_and_futures 17
[0 - 7ff844716b80]    0.000171 {4}{threads}: reservation ('dedicated worker (generic) #1') cannot be satisfied
Computing the first 17 Fibonacci numbers...
Fibonacci(0) = 0 (elapsed = 0.01 s)
Fibonacci(1) = 1 (elapsed = 0.01 s)
Fibonacci(2) = 1 (elapsed = 0.07 s)
Fibonacci(3) = 2 (elapsed = 0.33 s)
[0 - 7000995b8000]    1.017833 {6}{compqueue}: completion queue ID space exhausted!
Abort trap: 6

For whatever it's worth, this Fibonacci code is not at all representative of how actual Legion applications should be written, so I'm not sure how much these errors matter in practice. On the other hand, having our tutorials fail is not a great look either....
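To get a rough sense of why this example stresses the runtime, the sketch below counts task invocations. It assumes the tutorial computes each Fibonacci number by naive recursion, with every call for n >= 2 launching two subtasks; that is an assumption, since the tutorial source is not shown in this thread. Any helper tasks, such as one that sums the two results, are not counted. The function names are hypothetical.

```cpp
#include <cstdint>

// Number of task invocations needed to compute fib(n) by naive recursion,
// counting the call for n itself: calls(n) = 1 + calls(n-1) + calls(n-2),
// with calls(0) = calls(1) = 1. Assumes no memoization between tasks.
std::uint64_t naive_fib_calls(unsigned n) {
  std::uint64_t prev2 = 1, prev1 = 1;  // calls(0), calls(1)
  for (unsigned i = 2; i <= n; ++i) {
    std::uint64_t cur = 1 + prev1 + prev2;
    prev2 = prev1;
    prev1 = cur;
  }
  return prev1;
}

// Total invocations when computing the first `count` Fibonacci numbers,
// that is fib(0) through fib(count - 1), each from scratch.
std::uint64_t total_calls(unsigned count) {
  std::uint64_t total = 0;
  for (unsigned n = 0; n < count; ++n) total += naive_fib_calls(n);
  return total;
}
```

Under these assumptions, `total_calls(16)` is 5150 and `total_calls(17)` is 8343, so each extra number multiplies the work by roughly 1.6. This says nothing about how many of those tasks are live at the same moment, only how quickly the total grows.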

I'm not sure how we would fix that without either asking the user to change the Realm configuration or imposing on the mapper. Legion needs to create several completion queues for each inner task, so with a very large number of inner tasks you exhaust the ID space that Realm uses to name completion queues. We could print a clearer error message that points users at the Realm configuration, or we could have the mapper (using select_tasks_to_map) put an upper bound on the number of live tasks at any time, which should keep the number of live completion queues under Realm's configured bound.
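The idea of bounding live tasks can be sketched independently of Legion. The code below is not the Legion mapper API; `ReadyTask` and `LiveTaskThrottle` are hypothetical stand-ins that only show the bookkeeping: pick ready tasks while a cap allows, and free a slot when a task completes.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical stand-in for a task the mapper could choose to map.
struct ReadyTask { int id; };

// Keeps the number of selected-but-unfinished tasks at or below a cap.
class LiveTaskThrottle {
public:
  explicit LiveTaskThrottle(std::size_t max_live) : max_live_(max_live) {}

  // Select ready tasks in FIFO order until the cap is reached;
  // anything not selected stays in the queue for a later call.
  std::vector<ReadyTask> select(std::deque<ReadyTask>& ready) {
    std::vector<ReadyTask> chosen;
    while (!ready.empty() && live_ < max_live_) {
      chosen.push_back(ready.front());
      ready.pop_front();
      ++live_;
    }
    return chosen;
  }

  // Record that a previously selected task has finished.
  void on_complete() {
    if (live_ > 0) --live_;
  }

  std::size_t live() const { return live_; }

private:
  std::size_t max_live_;
  std::size_t live_ = 0;
};
```

A cap like this only helps if finished tasks actually release their slots. In a program where parent tasks block waiting on their children, a real mapper would also have to make sure the children can still be selected, otherwise the cap itself could stall progress.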