jrouwe/JoltPhysics

Compiling with double-precision for macOS <10.14 fails

Closed this issue · 2 comments

mihe commented

This one has me scratching my head a bit, but it seems like there's something going on with the collector classes in Character and CharacterVirtual when compiling double-precision builds for macOS >=10.12 (Sierra), where I get the following errors:

Jolt/Physics/Character/Character.cpp:135:8: error: aligned deallocation function of type 'void (void *, std::align_val_t) noexcept' is only available on macOS 10.14 or newer
        class MyCollector : public CollideShapeCollector
              ^
Jolt/Physics/Character/Character.cpp:139:14: note: in implicit destructor for 'MyCollector' first required here
                explicit                        MyCollector(Vec3Arg inUp, RVec3 inBaseOffset) : mBaseOffset(inBaseOffset), mUp(inUp) { }
                                                ^
Jolt/Physics/Character/Character.cpp:135:8: note: if you supply your own aligned allocation functions, use -faligned-allocation to silence this diagnostic
        class MyCollector : public CollideShapeCollector
              ^
Jolt/Physics/Character/CharacterVirtual.h:363:8: error: aligned deallocation function of type 'void (void *, std::align_val_t) noexcept' is only available on macOS 10.14 or newer
        class ContactCollector : public CollideShapeCollector
              ^
Jolt/Physics/Character/CharacterVirtual.cpp:235:19: note: in implicit destructor for 'JPH::CharacterVirtual::ContactCollector' first required here
        ContactCollector collector(mSystem, this, mMaxNumHits, mHitReductionCosMaxAngle, mUp, mPosition, outContacts);
                         ^
Jolt/Physics/Character/CharacterVirtual.h:363:8: note: if you supply your own aligned allocation functions, use -faligned-allocation to silence this diagnostic
        class ContactCollector : public CollideShapeCollector
              ^
Jolt/Physics/Character/CharacterVirtual.h:381:8: error: aligned deallocation function of type 'void (void *, std::align_val_t) noexcept' is only available on macOS 10.14 or newer
        class ContactCastCollector : public CastShapeCollector
              ^
Jolt/Physics/Character/CharacterVirtual.cpp:348:23: note: in implicit destructor for 'JPH::CharacterVirtual::ContactCastCollector' first required here
        ContactCastCollector collector(mSystem, this, inDisplacement, mUp, inIgnoredContacts, start.GetTranslation(), contact);
                             ^
Jolt/Physics/Character/CharacterVirtual.h:381:8: note: if you supply your own aligned allocation functions, use -faligned-allocation to silence this diagnostic
        class ContactCastCollector : public CastShapeCollector
              ^

The repro should hopefully be as simple as compiling on macOS with the following additional CMake defines:

-DDOUBLE_PRECISION=YES
-D"CMAKE_OSX_DEPLOYMENT_TARGET=10.12"
-D"CMAKE_OSX_ARCHITECTURES=x86_64;arm64"

(I don't have access to a macOS machine at the moment, so this is purely from messing around with Godot Jolt through GitHub Actions.)

From what I understand these errors have to do with the alignment of these classes being greater than sizeof(max_align_t), which supposedly causes the compiler to rely on the global aligned new/delete operators in some generated code, which isn't available prior to macOS 10.14. What's weird about this is that I believe sizeof(max_align_t) is 8 bytes on Apple Silicon, so I feel like I should be seeing these errors even without DOUBLE_PRECISION=YES, but I don't.

Seeing as how these classes are only ever stack-allocated, and any heap-allocated type uses the class operators through JPH_OVERRIDE_NEW_DELETE anyway, I assume it would be relatively safe to use the -faligned-allocation flag as suggested in the error messages, but I figured I'd check to see if you had any other ideas/theories.

EDIT: I also tried adding JPH_OVERRIDE_NEW_DELETE to these collector classes, but it made no difference.

EDIT 2: Thinking about it some more, I guess the reason this doesn't show up without DOUBLE_PRECISION=YES is because Apple Silicon probably implies a minimum macOS version of 11 (Big Sur) and x64 has a sizeof(max_align_t) of 16, which would line up with JPH_VECTOR_ALIGNMENT and thus the collectors wouldn't be over-aligned, maybe?

So if I understand the issue correctly, the problem occurs on x86_64 only and only on macOS 10.12. I'm not sure what the stack alignment on macOS is but it is indeed most likely 16. The warning is then probably triggered because the compiler cannot guarantee that the struct will be aligned to 32 bytes. If this is a problem or not really depends on the instructions that the compiler generates, in the old days compilers generated 'load aligned' (e.g. _mm256_load_pd) instructions which would break if the object wasn't 32 byte aligned. I think modern compilers don't do this anymore (they emit e.g. _mm256_loadu_pd), so I would expect that ignoring the warning would not really cause problems.

B.t.w. it doesn't help for this particular case because they only support Apple Silicon macs, but I don't own a mac either and I use scaleway to rent a virtual mac for day if I need one.

mihe commented

and only on macOS 10.12

Technically when targeting any version prior to 10.14 I guess, but yes, 10.12 and 10.13 in this case.

The warning is then probably triggered because the compiler cannot guarantee that the struct will be aligned to 32 bytes

I'm out of my depth here, but I don't get the impression that this is what the problem is. Judging by the error I get the impression that the compiler has some generated code in the implicit destructor (although I can't fathom what exactly) that makes use of the global delete that takes a std::align_val_t, which from what I've gathered it would have to use for deleting anything with an alignment larger than sizeof(max_align_t), as is the case here. macOS/OSX 10.12 doesn't seem to provide these in its C++ runtime, presumably for the same reason it didn't provide aligned_alloc prior to 10.15, as mentioned in #738, which I guess makes some kind of sense since they're both relatively recent additions (C++17).

Either way, given that this would probably require testing with whatever C++ runtime 10.12 shipped with, I don't think either one of us will be able to definitively know if any fix is truly viable, but I'd probably be fine with adding -faligned-allocation either on my end or yours and call it a day.

B.t.w. it doesn't help for this particular case because they only support Apple Silicon macs, but I don't own a mac either and I use scaleway to rent a virtual mac for day if I need one.

Ah, good to know, thank you!