Reply to Watchdog Termination: all threads locked in libobjc
I believe I've identified the issue: threads 12 and 19 do seem to be the culprits, specifically a deadlock between them. The issue appears to be caused by the interaction between libobjc and libdyld. When I reference files and line numbers here I'm looking at the source for objc4-818.2 and dyld-832.7.1, which I believe are the latest open source versions available.

As a reminder, this is what the frames for threads 12 and 19 looked like:

Thread 12 name: Dispatch queue: com.google.FIRCoreDiagnostics
Thread 12:
0   libsystem_kernel.dylib     0x00000001c6703f5c __ulock_wait + 8
1   libsystem_platform.dylib   0x00000001e2c150cc _os_unfair_lock_lock_slow + 196
2   libdyld.dylib              0x0000000199445ff4 dyld3::AllImages::infoForImageMappedAt(void const*, void + 57332 (dyld3::LoadedImage const&, unsigned char) block_pointer) const + 204
3   libdyld.dylib              0x0000000199445ec0 dyld3::AllImages::pathForImageMappedAt+ 57024 (void const*) const + 368
4   libdyld.dylib              0x000000019944b658 dyld3::dyld_image_path_containing_address+ 79448 (void const*) + 60
5   libobjc.A.dylib            0x00000001add6b248 objc_copyImageNames + 152
6   App                        0x00000001052389d8 FIRPopulateProtoWithNumberOfLinkedFrameworks + 17205720 (FIRCoreDiagnostics.m:483)
…

Thread 19 name: Dispatch queue: com.apple.CFNetwork.Connection
Thread 19:
0   libsystem_kernel.dylib     0x00000001c6703f5c __ulock_wait + 8
1   libsystem_platform.dylib   0x00000001e2c150cc _os_unfair_lock_lock_slow + 196
2   libobjc.A.dylib            0x00000001add6c174 lookUpImpOrForward + 152
3   libobjc.A.dylib            0x00000001add56524 _objc_msgSend_uncached + 68
4   libxpc.dylib               0x00000001e2c5a604 -[OS_xpc_object dealloc] + 56
5   libdyld.dylib              0x0000000199445404 invocation function for block in dyld3::AllImages::runImageCallbacks+ 54276 (dyld3::Array<dyld3::LoadedImage> const&) + 820
6   libdyld.dylib              0x00000001994449a0 dyld3::AllImages::runImageCallbacks+ 51616 (dyld3::Array<dyld3::LoadedImage> const&) + 172
7   libdyld.dylib              0x000000019944a2f0 dyld3::AllImages::loadImage+ 74480 (Diagnostics&, char const*, unsigned int, dyld3::closure::DlopenClosure const*, bool, bool, bool, bool, void const*) + 744
8   libdyld.dylib              0x0000000199449e2c dyld3::AllImages::dlopen+ 73260 (Diagnostics&, char const*, bool, bool, bool, bool, bool, void const*, bool) + 904
9   libdyld.dylib              0x000000019944bd14 dyld3::dlopen_internal+ 81172 (char const*, int, void*) + 372
10  libdyld.dylib              0x000000019943dd44 dlopen_internal+ 23876 (char const*, int, void*) + 112
11  libnetwork.dylib           0x000000019a8b6a6c __nw_protocol_get_tcp_image_block_invoke + 64
…
20  libnetwork.dylib           0x000000019a65ea94 nw_parameters_create_secure_tcp + 4672
21  CFNetwork                  0x0000000199ebaf88 0x199e00000 + 765832
…

On thread 19, dlopen_internal has been called in relation to CFNetwork. This calls into libdyld, and in dyld3::AllImages::runImageCallbacks libdyld's global lock (AllImages.h:255) is taken through a call to withNotifiersLock (AllImages.cpp:382). As part of runImageCallbacks there are a number of notifier blocks to be called; before or while this happens, we context switch to thread 12.

Thread 12 calls objc_copyImageNames, which locks libobjc's runtime lock (objc-runtime-new.mm:5521) before calling into libdyld through a call to fname (objc-runtime-new.mm:5543, defined in objc-private.h:505). This call eventually reaches AllImages::infoForImageMappedAt, where it attempts to take libdyld's global lock (AllImages.cpp:678) through a call to withReadLock. Thread 12 now waits here, as thread 19 has already acquired libdyld's global lock. Both libdyld's global lock and libobjc's runtime lock are now held by separate threads (19 and 12 respectively).

We context switch back to thread 19 and the notifier blocks are now called. One of them calls dealloc on an OS_xpc_object, within which a call that requires Objective-C method dispatch is made. That calls into lookUpImpOrForward, which attempts to take libobjc’s runtime lock (objc-runtime-new.mm:6427) but cannot, as it is already held by thread 12.

Thread 19 is now waiting on a lock held by thread 12 (libobjc's runtime lock) and thread 12 is waiting on a lock held by thread 19 (libdyld's global lock). We're now deadlocked and the system watchdog eventually terminates the process.

I've filed this as bug report FB8971497. Let me know if this sounds about right though.

I suspect we’ve begun to see this issue more often in our app recently because we’ve begun modularising, which has increased the number of dynamic libraries that we link. As this deadlock depends on concurrency and timing, it may be that the increased number of dynamic libraries makes thread 12's (com.google.FIRCoreDiagnostics) call to objc_copyImageNames take longer, whereas previously it would have finished by the time thread 19 called dlopen.
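
The lock ordering described above can be reproduced in miniature. The following is only a sketch under stated assumptions, not code from libobjc or libdyld: `loader_lock` and `runtime_lock` are hypothetical plain pthread mutexes standing in for libdyld's global lock and libobjc's runtime lock, and `demo_lock_inversion` is a name made up for illustration. Each thread takes its first lock, then probes the other with `pthread_mutex_trylock` so the program reports the inversion instead of hanging as the real threads did.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two locks discussed in the post. */
static pthread_mutex_t loader_lock  = PTHREAD_MUTEX_INITIALIZER; /* "libdyld global lock" */
static pthread_mutex_t runtime_lock = PTHREAD_MUTEX_INITIALIZER; /* "libobjc runtime lock" */

static atomic_int holding; /* threads that hold their first lock */
static atomic_int tried;   /* threads that have probed their second lock */

struct worker {
    pthread_mutex_t *first;   /* lock taken unconditionally */
    pthread_mutex_t *second;  /* lock the thread wants next */
    bool would_block;         /* true if the second lock was already held */
};

/* Spin (yielding) until the counter reaches the target value. */
static void wait_until(atomic_int *counter, int target) {
    while (atomic_load(counter) < target) {
        sched_yield();
    }
}

static void *run(void *arg) {
    struct worker *w = arg;
    pthread_mutex_lock(w->first);
    atomic_fetch_add(&holding, 1);
    wait_until(&holding, 2);            /* both threads now hold their first lock */

    /* A blocking lock here would deadlock; trylock only reports the conflict. */
    int rc = pthread_mutex_trylock(w->second);
    assert(rc == 0 || rc == EBUSY);
    w->would_block = (rc == EBUSY);
    if (rc == 0) {
        pthread_mutex_unlock(w->second);
    }

    atomic_fetch_add(&tried, 1);
    wait_until(&tried, 2);              /* keep the first lock until both have probed */
    pthread_mutex_unlock(w->first);
    return NULL;
}

/* Returns 1 if both threads would have blocked on each other, 0 otherwise,
 * or -1 if a thread could not be created. */
int demo_lock_inversion(void) {
    atomic_store(&holding, 0);
    atomic_store(&tried, 0);
    struct worker like_thread_19 = { &loader_lock,  &runtime_lock, false };
    struct worker like_thread_12 = { &runtime_lock, &loader_lock,  false };
    pthread_t a, b;
    if (pthread_create(&a, NULL, run, &like_thread_19) != 0) {
        return -1;
    }
    if (pthread_create(&b, NULL, run, &like_thread_12) != 0) {
        /* Unblock the first worker before joining it. */
        atomic_store(&holding, 2);
        atomic_store(&tried, 2);
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return (like_thread_19.would_block && like_thread_12.would_block) ? 1 : 0;
}
```

Calling `demo_lock_inversion()` from a `main` (built with `-pthread`) returns 1: each thread finds the lock it wants already held by the other. With blocking lock calls in place of `trylock`, neither thread could ever proceed, which matches the state the crash report shows.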
Jan ’21
Reply to Change in iOS 14 Beta 3 to trigger 0xdead10cc
The issue here is that previously working code is now being hit by this check. I took a look at our investigation of this (r. 66931425) but it’s too early for me to post any concrete details. I hope that’ll change soon (-:

Unfortunately the response on my bug report FB8128103 doesn't seem to suggest any investigation and, despite it being a binary compatibility issue, the response I received, in the form of the following message, asked me to close my report:

This is an issue specific to a third-party, not an Apple issue. This is a Realm bug that they’re tracking: https://github.com/realm/realm-cocoa/issues/6671 Please contact Realm for further support. Please close your feedback report, or let us know if this is still an issue for you.

I've responded accordingly re. binary compatibility though, so fingers crossed this gets another look.
Aug ’20
Reply to Change in iOS 14 Beta 3 to trigger 0xdead10cc
Thanks Quinn. I had stumbled upon some C code for that and was writing it up in Swift too, so thanks for saving me the time and helping me debug the cause. I was able to reproduce this on my personal device, so I could see the relevant Console log regarding suspension and termination and narrow down the file that is triggering the problem:

[application<…>:3879] Terminating with context: <RBSTerminateContext| domain:15 code:0xDEAD10CC explanation:[application<…>:3879] was suspended with locked system files:
/var/mobile/Containers/Shared/AppGroup/A66EB78A-2BBC-49D4-BDEA-6A2AF7E8A5A6/default.realm.lock
not in allowed directories:
/var/mobile/Containers/Data/Application/E1435A44-ABC6-4254-B547-B5423D9FCAB1
/var/mobile/Containers/Data/Application/E1435A44-ABC6-4254-B547-B5423D9FCAB1/tmp
reportType:CrashLog maxTerminationResistance:Interactive>

This points to Realm's default.realm.lock being the locked file, which is not permitted because it sits within the App Group container as opposed to the app's own container (which I presume is the first 'allowed directory'). Whilst this explains the cause of the crash on beta 3, it doesn't explain why it only started occurring on beta 3. I will file a bug report as you suggest with respect to binary compatibility, but any insights you might be able to provide now that we've narrowed down the affected file would be much appreciated too!
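
To make "locked file" concrete, here is a minimal sketch in C. It assumes the lock in question is an advisory `flock(2)`-style lock, which is an assumption: the log above does not say which locking API is involved, and this is not the code discussed earlier in the thread nor how the system performs its check. `is_flocked` and `demo_lock_probe` are hypothetical helper names, and the path passed in should be a scratch file rather than a real Realm lock file.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/file.h>
#include <unistd.h>

/* Returns true if some other open file description currently holds a
 * flock() lock on `path`. Probes with a non-blocking exclusive lock. */
bool is_flocked(const char *path) {
    assert(path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return false;               /* cannot open: treat as not locked */
    }
    bool locked = false;
    if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
        locked = (errno == EWOULDBLOCK);
    } else {
        flock(fd, LOCK_UN);         /* probe succeeded, release immediately */
    }
    close(fd);
    return locked;
}

/* Creates a scratch file at `path`, holds a lock on it, and checks that the
 * probe sees the lock while held and no longer sees it after release.
 * Returns 1 on the expected behaviour, 0 otherwise, -1 on setup failure. */
int demo_lock_probe(const char *path) {
    int holder = open(path, O_RDWR | O_CREAT, 0600);
    if (holder < 0) {
        return -1;
    }
    if (flock(holder, LOCK_EX) != 0) {
        close(holder);
        return -1;
    }
    bool while_held = is_flocked(path);     /* lock is held via `holder` */
    flock(holder, LOCK_UN);
    bool after_release = is_flocked(path);  /* lock has been dropped */
    close(holder);
    unlink(path);                           /* remove the scratch file */
    return (while_held && !after_release) ? 1 : 0;
}
```

The probe works within a single process because `flock` locks belong to the open file description, so a second `open` of the same path conflicts with the first. If the lock is held in this way, releasing it before the app is suspended should leave nothing for the check to flag for that file.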
Jul ’20