Symbolicated crash report points to the wrong method

We are developing an XCFramework that contains Objective-C and C++ code. The C++ code is internal to the framework, so we strip all of its symbols.

The framework uses a few build settings to strip internal symbols:

  1. Build PostProcessing - Yes
  2. Strip Style - Non-Global Symbols
  3. Other C++ Flags - $(OTHER_CFLAGS) -fvisibility-inlines-hidden -fvisibility=hidden

Now, when a crash is caused by the abort() function, the topmost frame belonging to the framework is symbolicated to a completely different method. When the crash has some other cause, symbolication works fine. We have tried both the atos command and Mac symbolicator, and both show the same behaviour.

To explain the issue in more detail, here is a GitHub repository with everything needed to reproduce it. Its build settings are similar to our project's settings.

https://github.com/bmahajanZ/apple-build-issue

Answered by DTS Engineer in 797422022

Your example crash is mapping the wrong method name due to the extremely short length of your example function, for example:

void InternalCppClass::crashAppWithAbort() {
    abort();
}

Because of how the addresses recorded at the time of the crash are mapped to the addresses in the dSYM symbol table, atos can end up returning the name of a neighboring function in the symbol table when the function is extremely short.

I'm emphasizing the function length here because, from a pragmatic point of view, such short functions are relatively rare in real-world code bases, so you are unlikely to run into this often. If you're asking because you want to be sure your crash reports are accurate, this really only appears in this scenario. Further, if you do wind up in this scenario, the real function you're looking for is likely the immediate neighbor above or below the short function in the same source file.

We're already tracking this as a bug internally (r. 24748643), but I'd still appreciate it if you could file a bug report with your example so we have your report on record. If you open that report, please post the FB number here for my reference.

If you have any questions about filing a bug report, take a look at Bug Reporting: How and Why?

— Ed Ford,  DTS Engineer

The behaviour is still the same even after increasing the length of the function (changes pushed to GitHub).

Here is the feedback ID: FB14567182

With your updated test project, I'm seeing the correct backtrace. I built and ran the app as is (in Release mode) using Xcode 15.4, deployed to a device running iOS 17.4. Here's the snippet of the crash log I fetched from the device, after using the button that triggers the abort scenario:

3   libsystem_c.dylib             	       0x1a6063b8c abort + 191
4   CrashTestSDK                  	       0x103248254 0x103244000 + 16980
5   CrashTestSDK                  	       0x103248a94 0x103244000 + 19092
6   CrashTest                     	       0x102cb40f0 @objc ViewController.crashAppWithAbort(_:) + 124

... 

Binary Images:
       0x103244000 -        0x10324ffff CrashTestSDK arm64  <25321678ecfa3cc28cba4f8f1fafe640> /private/var/containers/Bundle/Application/76B6A167-32E3-4146-9905-C62B1E79EFDF/CrashTest.app/Frameworks/CrashTestSDK.framework/CrashTestSDK
       0x102cb0000 -        0x102cbffff CrashTest arm64  <4569900143be3563b1da376612981bb7> /private/var/containers/Bundle/Application/76B6A167-32E3-4146-9905-C62B1E79EFDF/CrashTest.app/CrashTest
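
As a reading aid for the log above: each unsymbolicated frame is printed as "address, load address + offset", where the offset is the frame address minus the image's load address from the Binary Images section. The minimal C++ sketch below reproduces that arithmetic. The helper name imageOffset is made up for illustration and is not part of any Apple tool.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: offset of a backtrace address inside its binary image.
// It is the frame address minus the image's load address, which is the first
// column of the matching "Binary Images" entry.
std::uint64_t imageOffset(std::uint64_t frameAddress, std::uint64_t loadAddress) {
    assert(frameAddress >= loadAddress);  // the frame must lie inside the image
    return frameAddress - loadAddress;
}
```

With the values from the log, imageOffset(0x103248254, 0x103244000) gives 16980 and imageOffset(0x103248a94, 0x103244000) gives 19092, matching frames 4 and 5. The same load address is what gets passed to atos with -l.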

I symbolicated the log with the following two approaches:

% xcrun crashlog /Path/To/CrashTest-2024-08-06-182515.ips 
<Output for frames 0-3 removed for the forums>

[  4] 0x0000000103248253 CrashTestSDK`InternalCppClass::crashAppWithAbort() + 75 at InternalCppClass.cpp:50:9
[  5] 0x0000000103248a93 CrashTestSDK`-[PublicHeader crashAppWithAbort] + 35 at PublicHeader.mm:16:14
[  6] 0x0000000102cb40ef CrashTest`merged @objc CrashTest.ViewController.crashAppWithAbort(Any) -> () + 123
<More output removed for other frames>

That path invokes LLDB to do some automatic symbolication and location of the dSYM file, and is generally a good approach to use.

I can do the same with atos:

% atos -arch arm64 -o /Path/To/CrashTestSDK.framework.dSYM/Contents/Resources/DWARF/CrashTestSDK -l 0x103244000 0x103248254 0x103248a94

InternalCppClass::crashAppWithBadAccess() (in CrashTestSDK) (InternalCppClass.cpp:54)
-[PublicHeader crashAppWithAbort] (in CrashTestSDK) (PublicHeader.mm:17)

However, note that the line numbers in this case don't match the LLDB output, and that the function name crashAppWithBadAccess is wrong. I used the addresses exactly as they appear in the crash report, but notice that the LLDB output references address 0x0000000103248253, while the crash report has 0x0000000103248254. Because your abort call is at the very end of the function, and because of how the system sets up frames to return, you're hitting what I described previously: the function named in the output is the neighbor of the one you're looking for.

The address recorded in the stack backtrace is the return address, that is, the address of the instruction the function would execute next (see the crash report documentation for this caveat). The call that actually led to the crash is the instruction before it, which the address 0x0000000103248253 falls inside. LLDB knows to take this into account. With atos, you sometimes need to do so manually (for short functions, or when the call is on the very last line of a function, as it is here). If you subtract one from each address and feed the result into atos, you get the right answer:

% atos -arch arm64 -o /Path/To/CrashTestSDK.framework.dSYM/Contents/Resources/DWARF/CrashTestSDK -l 0x103244000 0x103248253 0x103248a93
InternalCppClass::crashAppWithAbort() (in CrashTestSDK) (InternalCppClass.cpp:50)
-[PublicHeader crashAppWithAbort] (in CrashTestSDK) (PublicHeader.mm:16)
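
To make the off-by-one effect concrete, here is a minimal C++ sketch of a "nearest preceding symbol" lookup. It is a simplified model for illustration only, not how atos or LLDB are actually implemented. The start offset 0x4208 for crashAppWithAbort is inferred from the LLDB output above (0x103248253 minus 75, relative to load address 0x103244000). The start offset 0x4254 for crashAppWithBadAccess is an assumption, chosen so that it begins right where the call to abort returns. All type and function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

struct Symbol {
    std::uint64_t start;  // offset of the symbol's first instruction in the image
    std::string name;
};

// Returns the name of the symbol with the greatest start <= offset.
// The table must be sorted by start offset.
std::string lookup(const std::vector<Symbol>& table, std::uint64_t offset) {
    auto it = std::upper_bound(
        table.begin(), table.end(), offset,
        [](std::uint64_t value, const Symbol& s) { return value < s.start; });
    if (it == table.begin()) return "<unknown>";  // offset precedes every symbol
    return (it - 1)->name;
}

// For a return address taken from a backtrace, step back one byte so the
// lookup lands inside the call instruction instead of just after it.
std::string lookupReturnAddress(const std::vector<Symbol>& table,
                                std::uint64_t offset) {
    assert(offset > 0);
    return lookup(table, offset - 1);
}

// Illustrative table: 0x4208 is inferred from the LLDB output,
// 0x4254 is an assumed start for the neighboring function.
std::vector<Symbol> exampleTable() {
    return {
        {0x4208, "InternalCppClass::crashAppWithAbort()"},
        {0x4254, "InternalCppClass::crashAppWithBadAccess()"},
    };
}
```

Under these assumptions, lookup(exampleTable(), 0x4254) names crashAppWithBadAccess, while lookupReturnAddress(exampleTable(), 0x4254) names crashAppWithAbort, which mirrors the two atos results above.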

— Ed Ford,  DTS Engineer
