apple/swift-algorithms

Adding a CompactedCollection to an Array triggers EXC_BAD_ACCESS

davbeck opened this issue · 13 comments

Concatenating a CompactedCollection of objects with an array of objects crashes with EXC_BAD_ACCESS when the code is built in release mode.

Swift Algorithms version:
1.1.0 but I also saw this in 0.0.3

Swift version:
swift-driver version: 1.87.1 Apple Swift version 5.9 (swiftlang-5.9.0.128.108 clang-1500.0.40.1)
Target: arm64-apple-macosx14.0

Checklist

  • If possible, I've reproduced the issue using the main branch of this package
  • I've searched for existing GitHub issues

Steps to Reproduce

I've attached an iOS project that reproduces this. From the Xcode template, I've edited the scheme to run in Release mode. Here is the relevant code that triggers the crash:

final class Thing {}
let things = [Thing()].compacted() + [Thing()]
print("things", things)

Expected behavior

Not to crash
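
For reference, the expected result can be sketched with the standard library alone. This is only a stand-in: compactMap replaces compacted() so that the snippet does not depend on swift-algorithms, and the explicit [Thing?] annotation is my assumption about how the array literal in the original expression is inferred.

```swift
final class Thing {}

// Stand-in for `[Thing()].compacted()`: the element type must be optional
// for compaction to apply (assumed inference for the original literal).
let optionalThings: [Thing?] = [Thing()]

// compactMap drops nil values and unwraps the rest, giving [Thing].
let compactedThings = optionalThings.compactMap { $0 }

// Concatenating with a plain array should yield two valid Thing instances.
let things = compactedThings + [Thing()]
print(things.count)  // prints "2"
```

Both elements should be printable without touching invalid memory, which is what the release build of the original expression fails to do.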

Actual behavior

EXC_BAD_ACCESS

Thanks for reporting this, @davbeck! Verified in Swift 5.9 – will investigate further.

swift-driver version: 1.87.1 Apple Swift version 5.9 (swiftlang-5.9.0.128.108 clang-1500.0.40.1)
Target: arm64-apple-macosx14.0

Hi, I saw the "help wanted" tag on this and have been investigating. Hope I haven't duplicated anyone's efforts. Here's what I've learned so far.

My own SwiftUI iOS test app Xcode project has similar code to the example above:

struct ContentView: View {
    let things = [Thing()].compacted() + [Thing()]

    var body: some View {
        let _ = print(Unmanaged.passUnretained(things[0]).toOpaque())
        VStack {
            Text(things.description)
        }
        .padding()
    }
}

and crashes with this stack trace when built using Release configuration in Xcode 15.1:

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x1e)
    frame #0: 0x0000000180082c94 libobjc.A.dylib`objc_opt_class + 16
    frame #1: 0x0000000192cde454 libswiftCore.dylib`swift_getObjectType + 80
    frame #2: 0x0000000192c91b78 libswiftCore.dylib`tryCast(swift::OpaqueValue*, swift::TargetMetadata<swift::InProcess> const*, swift::OpaqueValue*, swift::TargetMetadata<swift::InProcess> const*, swift::TargetMetadata<swift::InProcess> const*&, swift::TargetMetadata<swift::InProcess> const*&, bool, bool) + 844
    frame #3: 0x0000000192c91d94 libswiftCore.dylib`tryCast(swift::OpaqueValue*, swift::TargetMetadata<swift::InProcess> const*, swift::OpaqueValue*, swift::TargetMetadata<swift::InProcess> const*, swift::TargetMetadata<swift::InProcess> const*&, swift::TargetMetadata<swift::InProcess> const*&, bool, bool) + 1384
    frame #4: 0x0000000192c916f8 libswiftCore.dylib`swift_dynamicCast + 172
    frame #5: 0x0000000192a6f264 libswiftCore.dylib`Swift._debugPrint_unlocked<τ_0_0, τ_0_1 where τ_0_1: Swift.TextOutputStream>(τ_0_0, inout τ_0_1) -> () + 156
    frame #6: 0x0000000192b00b94 libswiftCore.dylib`generic specialization <Swift.String> of Swift._debugPrint<τ_0_0 where τ_0_0: Swift.TextOutputStream>(_: Swift.Array<Any>, separator: Swift.String, terminator: Swift.String, to: inout τ_0_0) -> () + 156
  * frame #7: 0x00000001929e5c30 libswiftCore.dylib`merged Swift.Array.description.getter : Swift.String + 384
    frame #8: 0x000000010001f608 swift-algorithms-issue-209`ContentView.body.getter [inlined] closure #1 (self=<unavailable>, self=swift_algorithms_issue_209.ContentView @ x20) -> SwiftUI.Text in swift_algorithms_issue_209.ContentView.body.getter : some at ContentView.swift:18:25 [opt]
    frame #9: 0x000000010001f5f4 swift-algorithms-issue-209`ContentView.body.getter [inlined] generic specialization <SwiftUI.Text> of SwiftUI.VStack.init(alignment: SwiftUI.HorizontalAlignment, spacing: Swift.Optional<CoreGraphics.CGFloat>, content: () -> τ_0_0) -> SwiftUI.VStack<τ_0_0> at <compiler-generated>:0 [opt]
    frame #10: 0x000000010001f5f4 swift-algorithms-issue-209`ContentView.body.getter(self=swift_algorithms_issue_209.ContentView @ x20) at ContentView.swift:17:9 [opt]

Why the crash occurs

[This Swift stdlib and ObjC code is new to me, so let me know if I've gotten any of the details wrong.] The proximate cause of the crash is that the first Thing instance in things—the one initially belonging to the compacted array—has an isa pointer of 0x0000000000000001 when _debugPrint_unlocked tries casting it to various existential types that might provide a printable description. Because the isa pointer isn't 0, but also doesn't pass the SWIFT_ISA_MASK test, swift_getObjectType instead considers whether it might be an Objective-C class instance, and tries obtaining the result of [thing class]. objc_opt_class also applies the SWIFT_ISA_MASK to things[0]'s isa pointer, but then tries to read at an offset of 0x1e from the resulting address of 0, resulting in EXC_BAD_ACCESS.
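
The masking step can be illustrated with a little arithmetic. This is only a sketch: the mask constant below is an assumption about what an arm64-style isa mask looks like (check the runtime headers for the real definition), and the variable names are made up for illustration. The only property that matters is that the mask clears the low bits, so an isa of 0x1 masks down to a null class pointer.

```swift
// Assumed arm64-style isa mask; the exact constant is not taken from this thread.
let assumedISAMask: UInt64 = 0x0000_000f_ffff_fff8

// The corrupted isa value observed for things[0].
let corruptedISA: UInt64 = 0x0000_0000_0000_0001

// Masking clears the low bits, so nothing of the value survives.
let maskedClassPointer = corruptedISA & assumedISAMask

print(String(maskedClassPointer, radix: 16))  // prints "0"
```

Any read at a small offset from that null result lands at a tiny address, which is consistent with the faulting address in the stack trace.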

The first Thing instance's isa pointer is only 0x0000000000000001 when the project is built with the Release build configuration using Xcode 15.0.1 or 15.1 (I haven't tested 15.0). It is a valid address, and the crash doesn't occur, when building Debug configuration with 15.0.1 or greater, or when building with either Debug or Release configuration with Xcode 14.2 or 14.3.1.

Corruption mechanism still unknown

I tried to determine when the isa pointer gets set to 0x0000000000000001, but unfortunately I found that if I set a breakpoint anywhere before the line with things.description, a crash with a different stack trace occurs. In that case the isa pointer has what looks like a valid value, though it differs from the value seen when building in Debug mode.
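
As an alternative to breakpoints, the first word of an object (where the isa or metadata pointer lives) can be read from Swift code itself. The helper below is hypothetical and not something used elsewhere in this thread; it builds on the same Unmanaged.passUnretained(...).toOpaque() call as the test app. Adding such a call may itself change what the optimizer does, so treat it as a sketch rather than a reliable probe.

```swift
final class Thing {}

// Hypothetical helper: returns the first pointer-sized word of a class instance.
// On Apple platforms this word holds the isa pointer (possibly with extra bits).
func firstWord(of object: AnyObject) -> UInt {
    // Keep the object alive while its memory is being read.
    withExtendedLifetime(object) {
        Unmanaged.passUnretained(object).toOpaque().load(as: UInt.self)
    }
}

let thing = Thing()
print(String(firstWord(of: thing), radix: 16))
```

For a healthy object the printed value should look like a plausible address rather than 1.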

I then tried to identify the Swift commit that caused this issue by doing a binary search through the swift-5.9-DEVELOPMENT-SNAPSHOT-<datestamp> tags, building the toolchain from the tag and then testing for the crash using

SWIFT_EXEC=../swift-project/build/Ninja-RelWithDebInfoAssert/swift-macosx-arm64/bin swift run --configuration release

in a directory containing a very simple Swift package with an executable target that does

let twoNewThings = [Thing()].compacted() + [Thing()]
print(twoNewThings.description)

Issue might involve Swift Package Manager

My results were confusingly inconsistent until I realized that the toolchain used to compile didn't matter (as long as it was built from the commits I tried on release/5.9)—it's actually the toolchain used to interpret swift run that matters.

This doesn't crash

$ DEVELOPER_DIR=/Applications/Xcode-14.3.1.app/Contents/Developer SWIFT_EXEC=../swift-project/build/Ninja-RelWithDebInfoAssert/swift-macosx-arm64/bin swift run --configuration release
Building for production...
remark: Incremental compilation has been disabled: it is not compatible with whole module optimization
[2/2] Linking swift-algorithms-issue-209-testing
Build complete! (0.82s)
[swift_algorithms_issue_209_testing.Thing, swift_algorithms_issue_209_testing.Thing]

but this does

$ DEVELOPER_DIR=/Applications/Xcode-15.0.1.app/Contents/Developer SWIFT_EXEC=../swift-project/build/Ninja-RelWithDebInfoAssert/swift-macosx-arm64/bin swift run --configuration release
Building for production...
remark: Incremental compilation has been disabled: it is not compatible with whole module optimization
remark: Incremental compilation has been disabled: it is not compatible with whole module optimization
remark: Incremental compilation has been disabled: it is not compatible with whole module optimization
[5/5] Linking swift-algorithms-issue-209-testing
Build complete! (3.55s)
[1]    56227 segmentation fault  DEVELOPER_DIR=/Applications/Xcode-15.0.1.app/Contents/Developer SWIFT_EXEC=  

even though the custom-built toolchain specified by SWIFT_EXEC is exactly the same (most recently built from swift-5.9-DEVELOPMENT-SNAPSHOT-2023-07-29-a, but I tried several others in my uncompleted binary search).

swift run is handled by Swift Package Manager (if I've read the code correctly), and presumably SPM is similarly involved in how Xcode builds Swift package sources.

Next steps

I compared the logs from builds of my iOS test project in Xcode 14.3.1 vs. 15.1, and noted that for Xcode 15.1 the command line used to compile the Algorithms module has several extra switches

-suppress-warnings
-validate-clang-modules-once
-clang-build-session-file
-emit-const-values
-const-gather-protocols-file

so I'd like to see if any of these could be the cause of the problem.

I'd also like to repeat my binary search through the Swift 5.9 development tags, now that I know how to get reliable results from my test.

And I'd like to study the implementation of swift run to see if there are any obvious explanations there.

I've been working with a more minimal reproducible case which eliminates SPM. It's this command-line macOS program

$ egrep -v "^\/\/" main.swift

import Foundation

final class Thing {}

let twoNewThings = [Thing()].compacted() + [Thing()]
print(twoNewThings.description)

compiled with Compacted.swift copied from swift-algorithms. Using Xcode 15.2 Beta released on 2023-12-12

$ /Applications/Xcode-15.2.0-Beta.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/swiftc -version
swift-driver version: 1.87.3 Apple Swift version 5.9.2 (swiftlang-5.9.2.2.56 clang-1500.1.0.2.5)
Target: arm64-apple-macosx14.0

building without optimization produces a binary that doesn't crash

$ /Applications/Xcode-15.2.0-Beta.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/swiftc -sdk /Applications/Xcode-15.2.0-Beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk main.swift Compacted.swift
$ ./main 
[main.Thing, main.Thing]

But building with -O -whole-module-optimization produces a binary that reproduces the crash.

$ /Applications/Xcode-15.2.0-Beta.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/swiftc -sdk /Applications/Xcode-15.2.0-Beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk main.swift Compacted.swift -O -whole-module-optimization
$ ./main 
[1]    40862 segmentation fault  ./main

The stack trace shows it's the same crash as with my iOS test case

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x1e)
  * frame #0: 0x00000001878ec48c libobjc.A.dylib`objc_opt_class + 48
    frame #1: 0x000000019793ca64 libswiftCore.dylib`swift_getObjectType + 208
    frame #2: 0x00000001978db06c libswiftCore.dylib`tryCast(swift::OpaqueValue*, swift::TargetMetadata<swift::InProcess> const*, swift::OpaqueValue*, swift::TargetMetadata<swift::InProcess> const*, swift::TargetMetadata<swift::InProcess> const*&, swift::TargetMetadata<swift::InProcess> const*&, bool, bool) + 1192
    frame #3: 0x00000001978db350 libswiftCore.dylib`tryCast(swift::OpaqueValue*, swift::TargetMetadata<swift::InProcess> const*, swift::OpaqueValue*, swift::TargetMetadata<swift::InProcess> const*, swift::TargetMetadata<swift::InProcess> const*&, swift::TargetMetadata<swift::InProcess> const*&, bool, bool) + 1932
    frame #4: 0x00000001978daa50 libswiftCore.dylib`swift_dynamicCast + 208
    frame #5: 0x0000000197628c98 libswiftCore.dylib`Swift._debugPrint_unlocked<τ_0_0, τ_0_1 where τ_0_1: Swift.TextOutputStream>(τ_0_0, inout τ_0_1) -> () + 228
    frame #6: 0x00000001976e983c libswiftCore.dylib`generic specialization <Swift.String> of Swift._debugPrint<τ_0_0 where τ_0_0: Swift.TextOutputStream>(_: Swift.Array<Any>, separator: Swift.String, terminator: Swift.String, to: inout τ_0_0) -> () + 160
    frame #7: 0x000000019757bee8 libswiftCore.dylib`merged Swift.Array.description.getter : Swift.String + 504
    frame #8: 0x0000000100002f24 main`main + 372
    frame #9: 0x00000001879250e0 dyld`start + 2360

The crash can also be reproduced using the toolchains and macOS SDKs from various other Xcode 15 releases I've tried, including Xcode 15.0 Beta, Xcode 15.0.1, and Xcode 15.1. It can't be reproduced with Xcode 14.3.1 or Xcode 14.2, even with -O -whole-module-optimization.

So I can confirm the issue is caused by -O -whole-module-optimization in Xcode 15.x toolchains, which I realize isn't that much more info than given in the original issue description. 😂

I'm still working on narrowing down the range of commits that may have introduced the problem, but I haven't yet built a custom toolchain that reproduces the crash using the above test case. I've tried various release/5.9 tags, including swift-5.9.2-RELEASE, which I would think is pretty close to the Xcode 15.2 Beta toolchain. Still experimenting with utils/build-script arguments, using Swift build bot logs as a guide. But so far the toolchains I've built locally differ in some important way from the toolchains shipped with the various Xcode 15 releases.

I found a set of utils/build-script options that can successfully build toolchains from commits ranging from swift-5.8.1-RELEASE to swift-5.9.2-RELEASE:

utils/build-script --skip-build-benchmarks --release-debuginfo --darwin-toolchain-require-use-os-runtime=0 --assertions --swift-enable-ast-verifier=0 --no-swift-stdlib-assertions --install-swift --install-swiftsyntax --install-destdir ../installs/$(git rev-parse --short=7 HEAD) --reconfigure --clean --clean-install-destdir

Using that with the test case mentioned above, I find that the issue seems to have been introduced between swift-DEVELOPMENT-SNAPSHOT-2022-12-21-a and swift-DEVELOPMENT-SNAPSHOT-2022-12-29-a.

# swift-DEVELOPMENT-SNAPSHOT-2022-12-29-a
$ ../swift-project/installs/71c62c0/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/swiftc main.swift Compacted.swift -sdk /Applications/Xcode-14.3.1.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk -O -whole-module-optimization 
$ ./main 
[1]    17175 segmentation fault  ./main

# swift-DEVELOPMENT-SNAPSHOT-2022-12-21-a
$ ../swift-project/installs/3d36113/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/swiftc main.swift Compacted.swift -sdk /Applications/Xcode-14.3.1.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk -O -whole-module-optimization 
$ ./main 
[main.Thing, main.Thing]

According to the output from utils/update-checkout, the following projects differ between those two snapshots:

llvm-project:               d4258b1 -> ab856b0
swift:                      3d36113 -> 71c62c0
swift-docc:                 150eb7d -> 496639f
swift-docc-render-artifact: 252584a -> 4abdb66
swift-syntax:               090adb4 -> a2d31e8
swiftpm:                    754a55f -> 14d05cc

I'm guessing the most fruitful projects to investigate are llvm-project and swift. I might also try to narrow the commit range further between the two snapshot tags.

After testing main branch commits between swift-DEVELOPMENT-SNAPSHOT-2022-12-21-a and swift-DEVELOPMENT-SNAPSHOT-2022-12-29-a, it seems the commit that introduced this issue is 004d0b1, "Merge pull request #61715 from eeckstein/alias-analysis".

My testing strategy was to check out the desired main branch commit in the swift project, then run

utils/update-checkout --match-timestamp --scheme main

then build with

utils/build-script --skip-build-benchmarks --release-debuginfo --darwin-toolchain-require-use-os-runtime=0 --assertions --swift-enable-ast-verifier=0 --no-swift-stdlib-assertions --install-swift --install-swiftsyntax --install-destdir ../installs/$(git rev-parse --short=7 HEAD) --reconfigure --clean --clean-install-destdir

Here are the results for 004d0b1 and the commit immediately preceding it on main, 77527c9:

$ ../swift-project/installs/004d0b1/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/swiftc main.swift Compacted.swift -sdk /Applications/Xcode-14.3.1.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk -O -whole-module-optimization 
$ ./main 
[1]    93784 segmentation fault  ./main

$ ../swift-project/installs/77527c9/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/swiftc main.swift Compacted.swift -sdk /Applications/Xcode-14.3.1.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk -O -whole-module-optimization 
$ ./main 
[main.Thing, main.Thing]
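
The narrowing procedure is essentially a binary search over an ordered list of commits. Here is a minimal sketch of that idea, assuming the crash, once introduced, reproduces on every later commit in the range. The function name and the `crashes` closure are hypothetical; in practice the closure stands for "build a toolchain at this commit, compile the test case with -O -whole-module-optimization, and run it". The four hashes and their pass/fail results are the ones reported in this thread; their relative order on main is assumed.

```swift
// Returns the first commit for which `crashes` is true, or nil if the last
// commit in the list does not crash. Assumes results are monotonic.
func firstBadCommit(in commits: [String], crashes: (String) -> Bool) -> String? {
    guard let last = commits.last, crashes(last) else { return nil }
    var low = 0                      // lowest index that could be the first bad commit
    var high = commits.count - 1     // index known to crash
    while low < high {
        let mid = low + (high - low) / 2
        if crashes(commits[mid]) {
            high = mid               // first bad commit is at mid or earlier
        } else {
            low = mid + 1            // first bad commit is after mid
        }
    }
    return commits[low]
}

// Oldest to newest, with the observed results standing in for real builds.
let commits = ["3d36113", "77527c9", "004d0b1", "71c62c0"]
let observedCrashes: Set<String> = ["004d0b1", "71c62c0"]
let culprit = firstBadCommit(in: commits) { observedCrashes.contains($0) }
print(culprit ?? "no crashing commit found")  // prints "004d0b1"
```

Each probe costs a full clean toolchain build, so halving the range at every step matters far more here than it would for an ordinary test suite.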

The update-checkout command resulted in different commits for llvm-project for those two toolchain builds, so I also tested a build of swift 004d0b1 with llvm-project 0146e80, the same llvm-project commit as for swift 77527c9. The test binary built with that toolchain also crashed, so I believe the only relevant difference is in the swift project. Here is the minimum set of project differences between a toolchain I tested that doesn't reproduce the issue, and one that does.

llvm-project: 0146e804f05381fb24119080c1e8c4777319920d [same for both]
swift: 77527c9e41b3d71501bf35fe85bf0a4d58b7fcce -> 004d0b13530ac9cb3793d553b0ffc028e75f9d3e
swift-docc: 150eb7d295b65e8e66495c66517c98443918303b -> 496639f93275d783a2e7bdc31ad94a159b4cc963
swiftpm: 723a5925412f756333a77fee74aedae13569814e -> 14d05ccaa13b768449cd405fff81d630a520e04a

I also built a toolchain identical to the one that reproduces the issue, except with alias analysis disabled (if I'm understanding the code, which is definitely not guaranteed 😄) by the following diff

diff --git a/SwiftCompilerSources/Sources/Optimizer/PassManager/PassRegistration.swift b/SwiftCompilerSources/Sources/Optimizer/PassManager/PassRegistration.swift
index 032296e4e3b..baf1eee354c 100644
--- a/SwiftCompilerSources/Sources/Optimizer/PassManager/PassRegistration.swift
+++ b/SwiftCompilerSources/Sources/Optimizer/PassManager/PassRegistration.swift
@@ -79,6 +79,6 @@ private func registerSwiftPasses() {
 }
 
 private func registerSwiftAnalyses() {
-  AliasAnalysis.register()
+//  AliasAnalysis.register()
   CalleeAnalysis.register()
 }

That toolchain did not reproduce the issue, so I'm focusing further investigation on the new alias analysis.

Unfortunately, it's still the case that when I step through the instructions of a crashing binary from the beginning of main, to try to see when the compacted Thing is corrupted, I don't see the corruption, and it crashes differently than when I wait to break until the call to description, or let it run without breakpoints.

@davbeck / @toddthomas – thanks for your work looking into this! It looks like there were some overreleases fixed in 5.10. Could I ask you to verify that your test cases work correctly with a 5.10 or later compiler? I'm seeing my reproducer fixed on the latest toolchain (version 5.11-dev).

I still get the crash using a toolchain built from the latest tag I see on release/5.10 (swift-5.10-DEVELOPMENT-SNAPSHOT-2024-01-18-a), using the toolchain build command and test case compile command I posted above.

As with my previous testing, I only get the crash when the test case is compiled with -O -whole-module-optimization.

No crash with no optimization, no crash with only -O, no crash with -Osize or -Osize -whole-module-optimization.

I'll try latest main.

A toolchain built from the most recent tag on main, swift-DEVELOPMENT-SNAPSHOT-2024-01-18-a, does not reproduce the issue.

$ ../swift-project/installs/99e9db8/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/swiftc main.swift Compacted.swift -sdk /Applications/Xcode-14.3.1.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk -O -whole-module-optimization 
$ ./main
[1]    11677 segmentation fault  ./main
$ ../swift-project/installs/517d187/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/swiftc main.swift Compacted.swift -sdk /Applications/Xcode-14.3.1.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk -O -whole-module-optimization
$ ./main
[main.Thing, main.Thing]

(where 99e9db8 is swift-5.10-DEVELOPMENT-SNAPSHOT-2024-01-18-a and 517d187 is swift-DEVELOPMENT-SNAPSHOT-2024-01-18-a).

On main the fix happens somewhere between swift-DEVELOPMENT-SNAPSHOT-2023-11-13-a and swift-DEVELOPMENT-SNAPSHOT-2024-01-03-a. I'll update when I've found it.

Commit d93e65d, the merge of PR 70710, is the fix.

I don't see the diff in lib/SILOptimizer/Utils/Generics.cpp from that commit on release/5.10, which is consistent with the crash still reproducing on that branch.

There's something important in the command used to build the toolchain.

If I use a build command that's basically the one from the getting started guide, with options added to install the toolchain:

utils/build-script --skip-build-benchmarks --skip-ios --skip-watchos --skip-tvos --swift-darwin-supported-archs "$(uname -m)" --release-debuginfo --swift-disable-dead-stripping --install-swift --install-swiftsyntax --install-destdir ../installs/$(git rev-parse --short=7 HEAD)-getting-started-guide --reconfigure --clean --clean-install-destdir

then regardless of what commit I build from, the resulting toolchain doesn't build a test binary that reproduces this issue.

But if I use a toolchain build command I simplified a bit from what I see in logs on ci.swift.org:

utils/build-script --skip-build-benchmarks --release-debuginfo --darwin-toolchain-require-use-os-runtime=0 --assertions --swift-enable-ast-verifier=0 --no-swift-stdlib-assertions --install-swift --install-swiftsyntax --install-destdir ../installs/$(git rev-parse --short=7 HEAD) --reconfigure --clean --clean-install-destdir

then the resulting toolchain, when built from commits in the range [004d0b1, 86d9e2c], can build crashing test binaries.

I haven't narrowed down the exact combination of toolchain build command options necessary to demonstrate this issue. These clean builds are time consuming. 😅

@natecook1000 I noticed Xcode 15.3 Beta has a Swift 5.10 toolchain, so I built my test case with it, which crashed. That aligns with what I've seen with the toolchains I've built myself. The fix is on main, but not release/5.10.

$ /Applications/Xcode-15.3.0-Beta.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/swiftc -version 
swift-driver version: 1.90.8 Apple Swift version 5.10 (swiftlang-5.10.0.10.5 clang-1500.3.7.4)
Target: arm64-apple-macosx14.0
$ /Applications/Xcode-15.3.0-Beta.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/swiftc main.swift Compacted.swift -sdk /Applications/Xcode-15.3.0-Beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk -O -whole-module-optimization
$ ./main
[1]    3415 segmentation fault  ./main