Thank you @eskimo
I dug further and found the source of the problem, along with a repro[1]: in the dynamic library we load, a static initializer creates an nw_path_monitor, and once we fork, the child crashes in the atfork handler when the Network framework tries to clean up. I'll file a bug and see whether someone can tell me if it's operator error or just a bug :-)
I still have one more question, for educational purposes: are posix_spawn and NSTask doing something fundamentally different from just calling fork and exec*? I mean calling Mach APIs or other dark magic?
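For context on the question above, here is a minimal C sketch of the two launch styles being compared. The helper names run_with_fork_exec and run_with_posix_spawn are hypothetical, made up for illustration. My understanding, which nobody in this thread has confirmed, is that posix_spawn creates the child without running the parent's pthread_atfork child handlers, which is the code path that crashes in the report below. Treat that as an assumption rather than documented behaviour.

```c
#include <assert.h>
#include <errno.h>
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Wait for the child and return its exit code, or -1 if it did not exit normally. */
static int wait_for_exit(pid_t pid) {
    int status = 0;
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR) {
            return -1;
        }
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* fork + exec: the child starts as a copy of the parent, so any registered
 * pthread_atfork child handlers run in the child before execv is reached. */
int run_with_fork_exec(char *const argv[]) {
    assert(argv != NULL && argv[0] != NULL);
    pid_t pid = fork();
    if (pid == -1) {
        return -1;
    }
    if (pid == 0) {
        execv(argv[0], argv);   /* argv[0] must be an absolute path here */
        _exit(127);             /* only reached if execv failed */
    }
    return wait_for_exit(pid);
}

/* posix_spawn: the new process image is requested in a single call, with no
 * user-visible "child side of fork" stage in the calling program.
 * Assumption: atfork child handlers are not run on this path. */
int run_with_posix_spawn(char *const argv[]) {
    assert(argv != NULL && argv[0] != NULL);
    pid_t pid = 0;
    int rc = posix_spawn(&pid, argv[0], NULL, NULL, argv, environ);
    if (rc != 0) {
        return -1;              /* posix_spawn returns an errno value on failure */
    }
    return wait_for_exit(pid);
}
```

Both helpers take an absolute path in argv[0] and return the child's exit code, so calling either one with /bin/sh -c "exit 3" should return 3. The only difference is how the child comes into existence.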
[1] Simple main.m (used application template from Xcode)
// main.m
#import <Cocoa/Cocoa.h>
#import <Network/Network.h>
#import <dispatch/dispatch.h>

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        nw_path_monitor_t mon = nw_path_monitor_create();
        nw_path_monitor_set_update_handler(mon, ^(nw_path_t path) {
            NSLog(@"monitor updated");
        });
        nw_path_monitor_start(mon);
    }

    pid_t pid = fork();
    if (pid == -1) {
        NSLog(@"Fork failed");
        exit(1);
    }
    if (pid == 0) {
        while (true) {
            NSLog(@"Forked child here");
            sleep(1);
        }
        return 0;
    }
    return NSApplicationMain(argc, argv);
}
Child crashes with:
Process: NWForkCrash [69516]
Path: /Users/USER/Library/Developer/Xcode/DerivedData/NWForkCrash-dnvdeuuhbnuhxublasbhfmmluzqb/Build/Products/Debug/NWForkCrash.app/Contents/MacOS/NWForkCrash
Identifier: com.****.NWForkCrash
Version: 1.0 (1)
Code Type: ARM-64 (Native)
Parent Process: NWForkCrash [69508]
Responsible: NWForkCrash [69508]
User ID: 501
Date/Time: 2023-09-15 16:52:56.2815 +0200
OS Version: macOS 13.5.2 (22G91)
Report Version: 12
Anonymous UUID: D6E5A34D-2127-16AF-16E7-BDA9139A6A82
Sleep/Wake UUID: DF75986B-D513-4000-993D-69A7AB7261A1
Time Awake Since Boot: 88000 seconds
Time Since Wake: 740 seconds
System Integrity Protection: enabled
Crashed Thread: 0 Dispatch queue: com.apple.main-thread
Exception Type: EXC_BREAKPOINT (SIGTRAP)
Exception Codes: 0x0000000000000001, 0x0000000194551238
Termination Reason: Namespace SIGNAL, Code 5 Trace/BPT trap: 5
Terminating Process: exc handler [69516]
Application Specific Information:
BUG IN CLIENT OF LIBPLATFORM: os_unfair_lock is corrupt
Abort Cause 258
crashed on child side of fork pre-exec
Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0 libsystem_platform.dylib 0x194551238 _os_unfair_lock_corruption_abort + 88
1 libsystem_platform.dylib 0x19454c788 _os_unfair_lock_lock_slow + 332
2 Network 0x19b1b4af0 nw_path_shared_necp_fd + 124
3 Network 0x19b1b4698 -[NWConcrete_nw_path_evaluator dealloc] + 72
4 Network 0x19af9d970 __nw_dictionary_dispose_block_invoke + 32
5 libxpc.dylib 0x194260210 _xpc_dictionary_apply_apply + 68
6 libxpc.dylib 0x19425c9a0 _xpc_dictionary_apply_node_f + 156
7 libxpc.dylib 0x1942600e8 xpc_dictionary_apply + 136
8 Network 0x19acd5210 -[OS_nw_dictionary dealloc] + 112
9 Network 0x19b1beb08 nw_path_release_globals + 120
10 Network 0x19b3d4fa0 nw_settings_child_has_forked() + 312
11 libsystem_pthread.dylib 0x10463f7c8 _pthread_atfork_child_handlers + 76
12 libsystem_c.dylib 0x1943d9944 fork + 112
13 NWForkCrash 0x1045db024 main + 96 (main.m:21)
14 dyld 0x1941c7f28 start + 2236
I think it’s reasonable to argue that com.apple.security.get-task-allow should be allowed to facilitate debugging. If you agree, make your case in a bug report.
Agreed. I suggested that in FB13163106; hopefully someone finds it worth implementing :)
Thank you @eskimo for the quick answer :-)
Please post your bug number, just for the record.
Here it is: FB13136758.
If removing these entitlements helps with this problem — and I’m not 100% sure it will (...)
It does help, because I can then explicitly add com.apple.security.get-task-allow, which would otherwise prevent the helper from starting if com.apple.security.inherit is present. Not sure if this is a bug or a feature :-) The docs are clear that any entitlement other than the two used for sandbox inheritance will abort the application at start, but that is already not entirely true.
For anyone having the same problem:
The solution for me was to remove the com.apple.security.app-sandbox and com.apple.security.inherit entitlements and add com.apple.security.get-task-allow, so I can debug the helper, which still runs in the parent's sandbox. For production builds I reverse that, and it should work for the Mac App Store too, hopefully without any unexpected side effects.
Cool. Then assigning someone to automate this shouldn’t be a problem, right? (-:
Haha :-D
I'll share what we did in case someone trips over the same thing.
I took advantage of the fact that our test hosts are deployed with Full Disk Access granted to the Terminal application (which runs the test runner). I modify the user's TCC.db to allow System Settings automation, and then in the test script, when I need screen recording permission, I run an automation script[1] that adds the application under test to the screen recording allowlist. I'm not happy with that, since I would rather run a SQL insert into TCC.db in a fraction of the time, but it works fairly reliably and we don't need to disable SIP.
[1] The screen recording permissions are stored in the system-wide /Library/Application Support/com.apple.TCC, which has the rootless attribute and can't be modified directly without disabling SIP.
Thank you Eskimo,
It’s complicated, but a key factor is your app’s designated requirement. See TN3127 Inside Code Signing: Requirements.
Do I understand correctly, then, that TCC uses the application's designated requirement to set up the initial access.csreq field? If I change that field, will TCC just evaluate the stored requirement against the application, or does it also compare access.csreq with the application's actual DR (i.e. check that they match)?
This doesn’t make sense to me. Ad hoc signed code can’t use an App ID because an App ID must be authorised by a provisioning profile and a profile can only authorise code signed with signing identity whose certificate was issued by Apple. See TN3125 Inside Code Signing: Provisioning Profiles.
I'm probably using the wrong term here; I should have said "bundle identifier". Our build environment produces macOS application bundles that share the same bundle identifier, but one is ad hoc signed (codesign -s - ...) with restricted entitlements stripped, and is used on PR pipelines[1]. The other is properly signed with a Developer ID distribution certificate. The problem is that the machines that run the UI tests fetch from both of these queues, and even when we split them, the churn there (install/run/uninstall/repeat) seems to break TCC a lot.
[1] This is weird, but we have a few reasons for it, including:
Our security policy is that we don't sign anything with production distribution certificates unless it has been reviewed and merged to master.
Using a different bundle identifier is also not possible, because many of the third-party integrations we're also testing depend on the exact bundle ID.
Running unsigned code on Intel is also not an option: we still need to attach entitlements to the executables so we don't run into problems with the sandbox later.
Having a provisioning profile and a shared signing identity for the PR builds has a high maintenance cost, because we would need to constantly update the provisioning profile with new machine UUIDs (we're not a small company; we're roughly the size of yours :))
I strongly suggest building a non-Electron proof-of-concept (...)
That's what I did. Everything I'm describing here was done in a pure Mac app (no Electron in sight), but the final thing is of course integrated with Electron. I mentioned Electron just to illustrate why we can't simply rebuild our app.
I don't know of any built-in API that is going to give you that guarantee.
As far as I know, the first part is achievable with raw Mach ports (you can instruct the kernel to add the audit token to the message trailer); there is just a bunch of annoying bookkeeping around service bootstrap.
I think I got my answer: there is simply no better solution to my problem, from a security or practicality point of view, than raw Mach ports :(
Thank you Etresoft, I really appreciate the time you spent on this.
You will need to expand on that aspect.
The UI of the main app is Electron, the infamous Frankenstein's monster of node.js and Chromium, but the code that handles all the macOS integration bits and pieces is our native node.js add-on written in Objective-C/C++.
Because the app is still sandboxed, I think the sandbox will not allow public access to your message port.
I got the same impression, but after trying it[1] I found that the hole is open to anyone who knows the port name, which is trivial to get. The 2015 source code of CFMessagePort.c[2] confirms my assumption: it registers the port as a bootstrap service under the port name with bootstrap_register(), so basically only sandboxed apps outside of the app group are restricted from accessing it :)
I'm not sure what kind of built-in security you are looking for.
At a minimum, we need to be able to verify the sender's identity (i.e. whether the sender is an app signed by us). The ideal situation would be a fully private communication channel secured by app signature at the OS level.
I believe I can achieve the first requirement by using raw Mach messaging (with MACH_RCV_TRAILER_AUDIT), but this stuff is not exactly well documented (or I'm looking in the wrong places), and I'm afraid that sooner or later I will hit something Apple considers private API, effectively preventing the app from being accepted into the App Store :(
[1] - I wrote a command-line app (unsigned, without entitlements) and was able to open the port with CFMessagePortCreateRemote() and communicate without any problem (on Big Sur 11.1 (20C69) with Xcode 12.3 (12C33)).
[2] - https://opensource.apple.com/source/CF/CF-1151.16/CFMessagePort.c.auto.html
That's a good question. A normal XPC service is not a fit for the task. We have an application (the main application) which represents some kind of account, and recently we gained the ability to log in to another account. That's confusing for users, so we need to separate the second account into something that sits in the Dock and that the user can click, pin, etc. For that purpose I created a dummy app (the helper) which just sits in the Dock doing nothing[1] and, when activated, sends a message to the main application, which takes over and presents the proper UI to the user[2].
Notes:
[1] - We do a lot more than that: the main app configures the badge and menu items of the helper app, and the helper app sends more events than just activation, e.g. hide, terminated, menu item action, etc.
[2] - I won't deny that this is bonkers, but our other option is to cut the existing application into slices. It is built as a monolith, and thanks to its multi-platform nature that would take forever... so I would rather wait for the next rewrite :)