[WillBProg3245 emailed me their crash report.]
Well, that’s an interesting crash you’ve got there. I had a look at your crash report and it didn’t reveal anything new, so I then started poking around in your core.
% lldb -c core.719
(lldb) target create --core "core.719"
Core file '/Users/quinn/Desktop/core.719' (x86_64) was loaded.
(lldb) thread list
Process 0 stopped
…
thread #10: … libxpc.dylib`xpc_release + 6 …
…
(lldb) thread select 10
…
(lldb) disas -f
libxpc.dylib`xpc_release:
0x7fff65659d9e <+0>: testb $0x1, %dil
0x7fff65659da2 <+4>: jne 0x7fff65659ddd ; <+63>
-> 0x7fff65659da4 <+6>: movq (%rdi), %rax
…
As you can see, the program has crashed referencing RDI at +6. So what’s in RDI:
(lldb) p/x $rdi
(unsigned long) $0 = 0xe2160458f3753a00
Whoa, that does not look even close to a valid pointer. Heap pointers on modern versions of macOS typically look like 0x00006000_xxxxxxxx. Moreover, all pointers on macOS are typically 0x00007***_xxxxxxxx or less. 0xe2160458f3753a00 is way out of range. It’s even out of range if you rotate it by 4 bits (which you’ll commonly see in crash reports because the memory allocator does this to its free list to help track down bugs).
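To make that range check concrete, here is a minimal C sketch of the reasoning. The names `USER_SPACE_MAX`, `looks_like_user_pointer`, and `plausible_even_if_rotated` are hypothetical helpers invented for this illustration, and the upper bound is simply the 0x00007***_xxxxxxxx rule of thumb from above, not an official limit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the rule of thumb above, i.e. pointers are at or below
 * 0x00007fff_ffffffff. This is a heuristic, not a documented limit. */
#define USER_SPACE_MAX UINT64_C(0x00007fffffffffff)

/* Rotate a 64-bit value by 4 bits in either direction. */
static uint64_t rotate_left_4(uint64_t v)  { return (v << 4) | (v >> 60); }
static uint64_t rotate_right_4(uint64_t v) { return (v >> 4) | (v << 60); }

/* True if the value falls inside the assumed pointer range. */
static bool looks_like_user_pointer(uint64_t v) {
    return v != 0 && v <= USER_SPACE_MAX;
}

/* True if the value, or either 4-bit rotation of it, is in range. */
static bool plausible_even_if_rotated(uint64_t v) {
    return looks_like_user_pointer(v)
        || looks_like_user_pointer(rotate_left_4(v))
        || looks_like_user_pointer(rotate_right_4(v));
}
```

Calling `plausible_even_if_rotated(0xe2160458f3753a00)` returns false, because both rotations (0x2160458f3753a00e and 0x0e2160458f3753a0) are still far above the bound, while a heap-style value such as 0x6000022205f0 passes the plain check.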
So, where did this come from? Well, it’s crashed right at the start of xpc_release, meaning that RDI hasn’t been modified, meaning that it’s simply the object to be released. Clearly that’s bogus.
Now let’s pop up a level and look at the caller:
(lldb) f 1
frame #1: … libxpc.dylib`_xpc_dictionary_node_free + 62
…
(lldb) disas -f
libxpc.dylib`_xpc_dictionary_node_free:
…
0x7fff6565c843 <+24>: movq 0x8(%rbx), %rcx
0x7fff6565c847 <+28>: movq %rcx, 0x8(%rax)
0x7fff6565c84b <+32>: movq 0x8(%rbx), %rcx
0x7fff6565c84f <+36>: movq %rax, (%rcx)
0x7fff6565c852 <+39>: movq $-0x1, %rax
0x7fff6565c859 <+46>: movq %rax, (%rbx)
0x7fff6565c85c <+49>: movq %rax, 0x8(%rbx)
0x7fff6565c860 <+53>: movq 0x10(%rbx), %rdi
0x7fff6565c864 <+57>: callq 0x7fff65659d9e ; xpc_release
-> 0x7fff6565c869 <+62>: movq %rbx, %rdi
RDI comes from RBX + 0x10. In this context RBX is an internal XPC data structure used to manage a dictionary node. Unfortunately XPC isn’t part of Darwin, but the basic structure looks like this:
struct Node {               // offsets
    struct Node * next;     // 0x00
    struct Node * prev;     // 0x08
    xpc_object_t value;     // 0x10
    unsigned int type;      // 0x18
    unsigned int pad;       // 0x1c
    char key[0];            // 0x20
};
where:
next and prev are managed by the standard BSD macros in <sys/queue.h>
key is an unbounded array containing the dictionary node’s key
pad exists because key is actually a union, and one item of that union has to be pointer aligned
WARNING I’m discussing this structure as an aid to debugging. Do not rely on it for anything other than that. It’s not considered API.
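If you want to convince yourself that the offsets in that struct line up, here is a small C sketch that checks them at compile time. It makes some assumptions: `object_ref` is a hypothetical stand-in for xpc_object_t so the code builds without any XPC headers, `key` is written as a C99 flexible array member rather than `key[0]`, and the target is 64-bit. Like the struct above, it is a debugging aid only.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for xpc_object_t (an opaque pointer type). */
typedef void *object_ref;

/* Mirror of the layout described above; for debugging only, not API. */
struct Node {
    struct Node *next;   /* 0x00 */
    struct Node *prev;   /* 0x08 */
    object_ref   value;  /* 0x10 */
    unsigned int type;   /* 0x18 */
    unsigned int pad;    /* 0x1c */
    char         key[];  /* 0x20, flexible array member */
};

/* These checks assume a 64-bit target with 8-byte pointers. */
_Static_assert(offsetof(struct Node, value) == 0x10, "value must be at 0x10");
_Static_assert(offsetof(struct Node, pad)   == 0x1c, "pad must be at 0x1c");
_Static_assert(offsetof(struct Node, key)   == 0x20, "key must be at 0x20");

/* Print each field's offset so it can be compared with a memory dump. */
static void print_node_layout(void) {
    printf("next  0x%02zx\n", offsetof(struct Node, next));
    printf("prev  0x%02zx\n", offsetof(struct Node, prev));
    printf("value 0x%02zx\n", offsetof(struct Node, value));
    printf("type  0x%02zx\n", offsetof(struct Node, type));
    printf("pad   0x%02zx\n", offsetof(struct Node, pad));
    printf("key   0x%02zx\n", offsetof(struct Node, key));
}
```

The value offset is the one that matters here: it matches the 0x10(%rbx) load that feeds RDI just before the call to xpc_release.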
Let’s dump the RBX structure as bytes, ASCII, and words.
(lldb) m read -c 64 $rbx
0x6000022205f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ????????????????
0x600002220600: 00 3a 75 f3 58 04 16 e2 00 00 00 00 ef bb 3d ef .:u?X..?....?=?
0x600002220610: 44 33 43 33 30 38 43 46 2d 32 32 35 31 2d 31 41 D3C308CF-2251-1A
0x600002220620: 34 36 2d 46 42 44 37 2d 32 35 33 44 42 32 42 32 46-FBD7-253DB2B2
(lldb) m read -f x -s 8 -c 8 $rbx
0x6000022205f0: 0xffffffffffffffff 0xffffffffffffffff
0x600002220600: 0xe2160458f3753a00 0xef3dbbef00000000
0x600002220610: 0x4643383033433344 0x41312d313532322d
0x600002220620: 0x2d374442462d3634 0x3242324244333532
As you can see, key is a UUID, which is pretty reasonable given the context established by the backtrace. The next and prev fields are both set to (void *) -1, which is the TRASHIT value from <sys/queue.h>. The type field is 0, which is reasonable in this context.
That leaves value and pad, and both of those are very wonky. The value field is 0xe2160458f3753a00, which is the RDI value that triggered the crash. And the pad field looks kinda similar, that is, a seemingly random sequence of bytes.
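For anyone who wants to repeat that field-by-field reading, here is a C sketch that decodes the 64 bytes from the dump using the offsets in the struct. `kDump` is copied from the `m read` output above; `DecodedNode`, `read_le`, and `decode_node` are hypothetical names made up for this illustration. It reads the bytes as little-endian explicitly, so it does not depend on the machine you run it on.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The 64 bytes shown by `m read -c 64 $rbx`. */
static const unsigned char kDump[64] = {
    0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
    0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
    0x00, 0x3a, 0x75, 0xf3, 0x58, 0x04, 0x16, 0xe2,
    0x00, 0x00, 0x00, 0x00, 0xef, 0xbb, 0x3d, 0xef,
    0x44, 0x33, 0x43, 0x33, 0x30, 0x38, 0x43, 0x46,
    0x2d, 0x32, 0x32, 0x35, 0x31, 0x2d, 0x31, 0x41,
    0x34, 0x36, 0x2d, 0x46, 0x42, 0x44, 0x37, 0x2d,
    0x32, 0x35, 0x33, 0x44, 0x42, 0x32, 0x42, 0x32,
};

/* Hypothetical holder for the decoded fields; not an XPC type. */
struct DecodedNode {
    uint64_t next;     /* offset 0x00 */
    uint64_t prev;     /* offset 0x08 */
    uint64_t value;    /* offset 0x10 */
    uint32_t type;     /* offset 0x18 */
    uint32_t pad;      /* offset 0x1c */
    char     key[33];  /* up to 32 key bytes from offset 0x20, plus NUL */
};

/* Read an n-byte little-endian unsigned integer (n <= 8). */
static uint64_t read_le(const unsigned char *p, size_t n) {
    uint64_t v = 0;
    for (size_t i = n; i > 0; i--) {
        v = (v << 8) | p[i - 1];
    }
    return v;
}

/* Decode the node fields; returns 0 on success, -1 if the buffer is too short. */
static int decode_node(const unsigned char *bytes, size_t len,
                       struct DecodedNode *out) {
    if (len < 0x20) {
        return -1;
    }
    out->next  = read_le(bytes + 0x00, 8);
    out->prev  = read_le(bytes + 0x08, 8);
    out->value = read_le(bytes + 0x10, 8);
    out->type  = (uint32_t)read_le(bytes + 0x18, 4);
    out->pad   = (uint32_t)read_le(bytes + 0x1c, 4);

    /* Copy whatever key bytes the dump contains, capped to the buffer size. */
    size_t key_len = len - 0x20;
    if (key_len > sizeof out->key - 1) {
        key_len = sizeof out->key - 1;
    }
    memcpy(out->key, bytes + 0x20, key_len);
    out->key[key_len] = '\0';
    return 0;
}
```

Decoding `kDump` this way gives next and prev of all ones (the (void *) -1 pattern), value 0xe2160458f3753a00, type 0, pad 0xef3dbbef, and the first 32 characters of the key. The key is cut short only because the dump stops at 64 bytes.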
Alas, that’s about as far as I can take this in the time I have available on DevForums. It provides evidence for my initial suspicion, that this is a memory smasher of some form, but that doesn’t help you debug it. I’m actually quite surprised that ASan didn’t turn up anything else (Zombies is unlikely to help given that none of this is Objective-C or Swift).
Some questions:
Do you recognise the values in value (0xe2160458f3753a00) or pad (0xef3dbbef)? I’m curious if they’re anything obvious from the domain of your app.
When you ran with ASan, did you make sure to recompile your entire app, including any (non-system) libraries? When working with ASan, you want it to cover as much as possible, and that means you have to recompile everything with ASan enabled (it’s a shame that there’s no way to enable it for the system frameworks).
The other tool worth trying is libgmalloc. This is not nearly as cool as ASan, but it has one key advantage: It applies to all code in your process, including the system frameworks.
Share and Enjoy
—
Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware
let myEmail = "eskimo" + "1" + "@apple.com"