swiftwasm/swift

XML parsing with SwiftWasm crashes execution environment

Closed this issue · 3 comments

I'm encountering what appears to be memory corruption using XMLParser from FoundationXML to parse XML payloads when executing via Wasm.

The payloads are SVG. The parser works great when executing natively on macOS, Linux, Windows, iOS, and Android. However, when built for Wasm and executing the parser with a .wasm bundle using Electron, the iOS built-in Wasm execution engine in WKWebView, Wasmtime on Android, or even just vanilla Wasmtime on macOS, I get a crash during, at the end of, or shortly after, parsing an SVG payload. I can get it to happen with simple regular XML payloads as well.

I haven't been able to pin down the issue to a specific spot, but it appears to be in either _CFXMLInterface or libxml2 itself, likely related to attribute parsing or memory management related to attributes.

I've reduced the issue to a small test project that only requires FoundationXML. However, the small test project does not always reproduce the issue, as it is generally invalid memory accesses after parsing (or SVG processing during parsing) that actually cause the crash. To make this reproducible, we've forked wasmtime and gotten its wmemcheck feature to work with Swift's memory allocation scheme. It shows the first memory access violation shortly after the event, when trying to print an element's attributes in the parser delegate's parser(_:didStartElement:) method.

Reproduction on macOS is as follows:

Install SwiftWasm 5.10.0 Release, then create a small executable package:

% mkdir xml-test
% cd xml-test
% /Library/Developer/Toolchains/swift-wasm-5.10.0-RELEASE.xctoolchain/usr/bin/swift package init --type executable

Edit main.swift to be:

import Foundation
#if canImport(FoundationXML)
import FoundationXML
#endif

let xml = "<a b='0' />"

let parser = XMLParser(data: xml.data(using: .utf8)!)
class ParserDelegate: NSObject, XMLParserDelegate {
    func parser(_ parser: XMLParser, didStartElement elementName: String, namespaceURI: String?, qualifiedName qName: String?, attributes: [String: String] = [:]) {
        print("parser(_:didStartElement:) elementName: \(elementName) attributes: \(attributes)")
    }
}
let parserDelegate = ParserDelegate()
parser.delegate = parserDelegate

let parseCompleted = parser.parse()
if !parseCompleted {
    print("Parsing failed")
}

Build the project with the SwiftWasm toolchain:

% /Library/Developer/Toolchains/swift-wasm-5.10.0-RELEASE.xctoolchain/usr/bin/swift build --triple wasm32-unknown-wasi

Run .build/debug/xml-test.wasm using e.g. wasmtime. It will likely succeed and print: parser(_:didStartElement:) elementName: a attributes: ["b": "0"]
Use this fork of wasmtime with its wmemcheck feature configured for use with Swift's memory allocation scheme:

Install Rust (% curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh)
% git clone https://github.com/cobbal/wasmtime
% cd wasmtime
% git checkout cobbal/swift-wasm-wmemcheck
% cargo build --features wmemcheck
% ./target/debug/wasmtime -W wmemcheck ../xml-test/.build/debug/xml-test.wasm

Execution should fail with a stack trace and the message "Invalid store at addr 0x3ab9b8 of size 4". Because this appears to be stack corruption, your results may vary.

Additional history can be found on the swift.org forums: https://forums.swift.org/t/xml-parsing-with-swiftwasm-crashes-execution-environment

We may have found a solution. It appears that we had a single call in our parser that had a stack allocation over 70KB! It is a pretty large function, but far from our largest. It wasn't even being called recursively. If this proves to be the issue, then XMLParser and libxml2 are in the clear.

However, the binary produced by SwiftWasm did not offer any facility to know that a single function call allocated more of Wasm's machine stack than was available, and memory was silently corrupted. So there may still be work that needs to be done to at least warn of this situation, or tools to allow developers to see that this is happening.
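Until such tooling exists, one rough manual check is to compare the size of large value types against the stack budget. The sketch below is illustrative only: `Quad`, `Block8K`, `Block32K`, `HypotheticalScratch`, and `exceedsStack` are made-up names, not types from our parser, and the 64 KiB figure is the wasm-ld default mentioned in the reply below. A type's size is only a lower bound for a frame that holds it as a local, and the compiler may place or optimize locals differently, so treat the result as a hint rather than a measurement.

```swift
// Hypothetical types for illustration; tuples are stored inline, so these
// value types occupy their full size wherever a value of them lives.
typealias Quad<T> = (T, T, T, T)
typealias Block8K = Quad<Quad<Quad<Quad<Quad<UInt64>>>>>  // 4^5 * 8 = 8192 bytes
typealias Block32K = Quad<Block8K>                        // 4 * 8192 = 32768 bytes

struct HypotheticalScratch {
    var a: Block32K
    var b: Block32K
    var c: Block8K
}

// Assumed stack budget: 64 KiB.
let assumedStackBytes = 64 * 1024

/// Returns true when a single value of `type` is larger than the whole stack budget.
func exceedsStack<T>(_ type: T.Type, stackBytes: Int) -> Bool {
    return MemoryLayout<T>.size > stackBytes
}

// MemoryLayout only inspects the type; no value is ever placed on the stack here.
let scratchBytes = MemoryLayout<HypotheticalScratch>.size
let tooBig = exceedsStack(HypotheticalScratch.self, stackBytes: assumedStackBytes)
print("HypotheticalScratch: \(scratchBytes) bytes, exceeds stack: \(tooBig)")
```

A check like this only catches single oversized values; it says nothing about deep call chains or frames made of many medium-sized locals.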

We will also be upstreaming the changes we made to wasmtime's wmemcheck tool so it can be used to help debug memory issues in the context of Swift's memory allocation scheme.

@mstokercricut Have you tried increasing the stack size with -z stack-size=<value in bytes>? The default stack size allocated by wasm-ld is 64 KB, so a 70 KB stack allocation would overflow it.

https://book.swiftwasm.org/getting-started/troubleshooting.html#3-stack-overflow-is-occurring

$ swift build --triple wasm32-unknown-wasi -Xlinker -z -Xlinker stack-size=524288
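The value 524288 in the command above is 512 KiB expressed in bytes. As a small sketch of how a build script might derive those arguments, the helper below turns a size in KiB into the same linker flags; `stackSizeArguments(kibibytes:)` is a hypothetical name used for illustration, not part of SwiftPM or the SwiftWasm toolchain.

```swift
/// Builds the extra `swift build` arguments that pass `-z stack-size=<bytes>`
/// through to the linker. Hypothetical helper, for illustration only.
func stackSizeArguments(kibibytes: Int) -> [String] {
    let bytes = kibibytes * 1024
    return ["-Xlinker", "-z", "-Xlinker", "stack-size=\(bytes)"]
}

// 512 KiB reproduces the flags used in the command above.
let extraArguments = stackSizeArguments(kibibytes: 512)
print(extraArguments.joined(separator: " "))
// prints "-Xlinker -z -Xlinker stack-size=524288"
```

Choosing the size is a trade-off: the stack is reserved in the module's linear memory, so a larger value costs memory even when most of it goes unused.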

Turns out that XMLParser, _CFXMLInterface, and libxml2 are all blameless in this, and inappropriate use of a memory debugging tool led me down a winding path. Somehow doing better stack overflow detection at runtime would be great, though! The harrowing tale can be found here: https://forums.swift.org/t/xml-parsing-with-swiftwasm-crashes-execution-environment