The problem
I have a macOS app that hosts a content filtering system extension, like SimpleFirewall.
The app has been in production for a couple years.
I'm working on a new version, and in testing the release candidate, I'm getting a consistent crash that I believe is related to Swift concurrency back deployment. Here are the key details:
- building with Xcode 14.2 (Swift 5.7.2) on a machine running Monterey
- crash does not happen when building and testing from Xcode, locally
- crash does not happen on test machine running Ventura
- crash DOES happen always on a test machine running Big Sur
- only the root-user system extension crashes, not the host application
- the new version introduced async/await into the system extension
- crash report shows identical stack trace to well-known issue that had to do with concurrency back deployment
Is there a known issue/limitation with concurrency back deployment in the context of a system extension? Is there any reason why async/await shouldn't work in that context when deployed to Big Sur?
More details, context
The key lines of the crash stack trace are:
0 libswiftCore.dylib 0x00007fff2cdacdc7 swift::ResolveAsSymbolicReference::operator()(swift::Demangle::__runtime::SymbolicReferenceKind, swift::Demangle::__runtime::Directness, int, void const*) + 55
1 libswiftCore.dylib 0x00007fff2cdcf2dd swift::Demangle::__runtime::Demangler::demangleSymbolicReference(unsigned char) + 141
2 libswiftCore.dylib 0x00007fff2cdcc2a8 swift::Demangle::__runtime::Demangler::demangleType(__swift::__runtime::llvm::StringRef, std::__1::function<swift::Demangle::__runtime::Node* (swift::Demangle::__runtime::SymbolicReferenceKind, swift::Demangle::__runtime::Directness, int, void const*)>) + 168
3 libswiftCore.dylib 0x00007fff2cdb25a4 swift_getTypeByMangledNameImpl(swift::MetadataRequest, __swift::__runtime::llvm::StringRef, void const* const*, std::__1::function<swift::TargetMetadata<swift::InProcess> const* (unsigned int, unsigned int)>, std::__1::function<swift::TargetWitnessTable<swift::InProcess> const* (swift::TargetMetadata<swift::InProcess> const*, unsigned int)>) + 516
4 libswiftCore.dylib 0x00007fff2cdafd6d swift::swift_getTypeByMangledName(swift::MetadataRequest, __swift::__runtime::llvm::StringRef, void const* const*, std::__1::function<swift::TargetMetadata<swift::InProcess> const* (unsigned int, unsigned int)>, std::__1::function<swift::TargetWitnessTable<swift::InProcess> const* (swift::TargetMetadata<swift::InProcess> const*, unsigned int)>) + 477
5 libswiftCore.dylib 0x00007fff2cdaff9b swift_getTypeByMangledNameInContext + 171
6 com.myorg.app.filter-extension 0x000000010db2b8b7 0x10db02000 + 170167
7 libdispatch.dylib 0x00007fff20516806 _dispatch_client_callout + 8
8 libdispatch.dylib 0x00007fff2051798c _dispatch_once_callout + 20
9 libswiftCore.dylib 0x00007fff2cdbe16a swift_once + 26
10 com.myorg.app.filter-extension 0x000000010db2c5e3 0x10db02000 + 173539
11 com.myorg.app.filter-extension 0x000000010dbbd708 0x10db02000 + 767752
12 com.myorg.app.filter-extension 0x000000010db073cc 0x10db02000 + 21452
13 com.apple.NetworkExtension 0x00007fff2dfdd4d8 -[NEExtensionProviderContext createWithCompletionHandler:] + 377
14 com.apple.Foundation 0x00007fff215a7c96 __NSXPCCONNECTION_IS_CALLING_OUT_TO_EXPORTED_OBJECT_S1__ + 10
15 com.apple.Foundation 0x00007fff21552b98 -[NSXPCConnection _decodeAndInvokeMessageWithEvent:flags:] + 2271
16 com.apple.Foundation 0x00007fff2150a049 message_handler + 206
17 libxpc.dylib 0x00007fff20406c24 _xpc_connection_call_event_handler + 56
18 libxpc.dylib 0x00007fff20405a9b _xpc_connection_mach_event + 938
The first five frames are identical to an issue from Xcode 13.2.1, discussed in depth on the Swift forums:
https://forums.swift.org/t/async-await-crash-on-ios14-with-xcode-13-2-1/54541
...except I'm using Xcode 14.2. That makes me think it's not exactly the same bug, but another manifestation of a failure to link against the back-deployed concurrency lib, possibly because the system extension isn't able to access the back-deployed lib.
The archived app does have libswift_Concurrency.dylib at MyApp.app/Contents/Frameworks/libswift_Concurrency.dylib.
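To see which bundles inside the archived app carry that library, something like the following can help. This is only a minimal sketch: `find_concurrency_lib` is a hypothetical helper name of my own, and `MyApp.app` is the same placeholder path as above.

```shell
#!/bin/sh
# List every copy of the Swift Concurrency back-deployment library
# found under a bundle path (hypothetical helper, not an Apple tool).
find_concurrency_lib() {
    find "$1" -name 'libswift_Concurrency.dylib' 2>/dev/null
}

# Example (path is a placeholder):
#   find_concurrency_lib MyApp.app
```

The output shows whether the only copy lives in the app's Contents/Frameworks or whether the embedded .systemextension bundle has one of its own.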
What I've Tried
I tested the workaround in the above-mentioned thread, using lipo to remove the arm64 arch, but it didn't work.
I also tested adding -Xllvm -sil-disable-pass=alloc-stack-hoisting to Other Swift Flags, as suggested in https://developer.apple.com/forums/thread/697070.
I would greatly appreciate any assistance.
@ilis544 I finally got a reply back from DTS, and they provided a working solution. Here's the relevant part, for you and future googlers - I can confirm the workaround mentioned below fixed the issue for my app:
"This isn’t due to a bug with Swift Concurrency, but a side-effect of how System Extensions work, with an easy solution. Your intuition about a simple test case with the Simple Firewall sample project was correct, so I was using that project built with Xcode 14.3.1 and testing on macOS 11.4 to reproduce and test solutions.
Speaking generally of Swift Concurrency, when an app is built with Swift Concurrency code that needs to back deploy, Xcode detects this and deploys the library supporting back deployment to a standard location inside of the app bundle so that any binary inside of the app bundle can refer to that singular copy of the library. This library is weakly linked, which means that at process launch, if the system doesn’t locate the library in one of the standard locations, the system doesn’t abort the process launch due to a missing library.
In the crash here, the top frame is:
0 libswiftCore.dylib 0x00007fff2cdacdc7 swift::ResolveAsSymbolicReference::operator()(swift::Demangle::__runtime::SymbolicReferenceKind, swift::Demangle::__runtime::Directness, int, void const*) + 55
This is the Swift runtime failing to locate a symbol it needs for some code that is executing, which can also happen in scenarios outside of Swift Concurrency. While there have been real issues that Apple has addressed with Swift Concurrency that have this frame in the crash report, what happened here is that the system couldn’t locate the Swift Concurrency symbol because we never loaded the library, because the system couldn’t find it. Since the library is weakly linked, this wasn’t a crash on launch.
System extensions are unique compared to other types of extension points because the copy inside your app is inert and not executed. When the system extension activates through the user consent process, macOS copies it from your app to a location under /Library/SystemExtensions, and then executes this copy. When the system extension process launches, the search mechanism to locate the Swift Concurrency back deployment library doesn’t find it, because the library is in a shared location inside of your app bundle but outside of the system extension files that are copied. There is no mechanism for connecting the copied system extension back to its original copy inside the app for this library loading purpose, so there is no way for the system to locate the library inside of the original app bundle.
The solution here is two build settings on the system extension target:
- Set ALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES to Yes. This will copy the Swift Concurrency library into the System Extension during the build. This means there will be two copies, one in the standard location for the app for any concurrency needs in your app code, and a second copy inside the system extension for its own needs.
- Ensure that LD_RUNPATH_SEARCH_PATHS is not overridden. It should have its default value of $(inherited) @executable_path/../Frameworks @executable_path/../../../../Frameworks.
This second build setting is to make sure that Xcode inserts /usr/lib/swift into the right place in the search path list the system uses to locate libraries, so that on newer systems which have the Swift Concurrency library built into macOS, your app uses that copy instead of the back-deployment library. To confirm the search list order on your system extension, you can run otool -l, and ensure that /usr/lib/swift is listed first of the multiple LC_RPATH entries."
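For anyone checking their own build, the otool check described above can be reduced to a small filter. This is my own sketch, not part of the DTS reply: `list_rpaths` is a hypothetical helper, and it assumes the usual `otool -l` layout in which each `cmd LC_RPATH` line is followed by a `path ...` line.

```shell
#!/bin/sh
# Print the LC_RPATH entries, in order, from `otool -l` output on stdin.
# Hypothetical helper; assumes the rpaths contain no spaces.
list_rpaths() {
    awk '
        $1 == "cmd" && $2 == "LC_RPATH" { in_rpath = 1; next }
        in_rpath && $1 == "path"        { print $2; in_rpath = 0 }
    '
}

# Example (binary path is a placeholder for your extension's executable):
#   otool -l path/to/extension-binary | list_rpaths | head -n 1
# Per the reply above, the first entry should be /usr/lib/swift.
```

Reading from stdin keeps the filter independent of where the extension binary lives, so it can be pointed at either the copy inside the app bundle or the activated copy under /Library/SystemExtensions.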