[Resolved] App crashes when launched from dock, doesn't crash when run from terminal

Greetings everyone,


I posted a question about this on SO only to be nit-picked to death, so hopefully that can be avoided here.


Background

I have written an open-source cross-platform app that uses FLTK (which should be using Cocoa under the hood) to manage multiple VNC connections. I am not using Xcode to build this, only a coding editor (Geany) and the Xcode command-line tools.


On macOS my app crashes in a certain spot when launched from the Dock while on Linux, FreeBSD and OpenIndiana the app does not crash.


I am *not* listing any code here because this is more of a conceptual question, not a coding question.


Here is the oddity: When I run this app from the terminal on macOS, it does not crash -- at all, but when launched from the Dock, it crashes at the same point nearly every time.


This is the only question I'm asking: What would cause the app to behave so differently between launching from the Dock compared to being run from the terminal? I am not passing any arguments to the app from either launch method. The app is being launched from the same user account.


Again, I am not asking for code help, and not asking why a certain call is crashing, etc. I'm only wanting to know why my app would crash when launched from the Dock yet is totally fine when launched from the Terminal.


Thank you! :-D


Will B


* macOS 10.15.2
* Mac mini 2018

Accepted Reply

[WillBProg3245 emailed me their crash report.]

Well, that’s an interesting crash you’ve got there. I had a look at your crash report and it didn’t reveal anything new, so I then started poking around in your core.

% lldb -c core.719
(lldb) target create --core "core.719"
Core file '/Users/quinn/Desktop/core.719' (x86_64) was loaded.
(lldb) thread list 
Process 0 stopped
  …
  thread #10: … libxpc.dylib`xpc_release + 6 …
  …
(lldb) thread select 10
…
(lldb) disas -f 
libxpc.dylib`xpc_release:
    0x7fff65659d9e <+0>:  testb  $0x1, %dil
    0x7fff65659da2 <+4>:  jne    0x7fff65659ddd            ; <+63>
->  0x7fff65659da4 <+6>:  movq   (%rdi), %rax
…

As you can see, the program has crashed referencing RDI at +6. So what’s in RDI:

(lldb) p/x $rdi 
(unsigned long) $0 = 0xe2160458f3753a00

Whoah, that does not look even close to a valid pointer. Heap pointers on modern versions of macOS typically look like 0x00006000_xxxxxxxx. Moreover, all pointers on macOS are typically 0x00007***_xxxxxxxx or less. 0xe2160458f3753a00 is way out of range. It’s even out of range if you rotate it by 4 bits (which you’ll commonly see in crash reports because the memory allocator does this to its free list to help track down bugs).
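To make that range check concrete, here is a minimal sketch in C. The helper names and the USER_ADDR_MAX constant are hypothetical, chosen to match the "0x00007***_xxxxxxxx or less" rule of thumb above. Because the rotation direction isn't stated, the sketch tries both.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: user-space pointers in a 64-bit macOS process are at or
   below this value (the "0x00007***_xxxxxxxx or less" rule of thumb). */
#define USER_ADDR_MAX UINT64_C(0x00007fffffffffff)

/* Rotate a 64-bit value left by n bits. */
static uint64_t rotl64(uint64_t v, unsigned n) {
    n &= 63;
    return n ? (v << n) | (v >> (64 - n)) : v;
}

/* Rotate a 64-bit value right by n bits. */
static uint64_t rotr64(uint64_t v, unsigned n) {
    n &= 63;
    return n ? (v >> n) | (v << (64 - n)) : v;
}

/* True if v is non-null and inside the assumed user-space range. */
static bool plausible_user_pointer(uint64_t v) {
    return v != 0 && v <= USER_ADDR_MAX;
}

/* True if v, or v rotated by 4 bits in either direction, is in range. */
static bool plausible_even_if_rotated(uint64_t v) {
    return plausible_user_pointer(v)
        || plausible_user_pointer(rotl64(v, 4))
        || plausible_user_pointer(rotr64(v, 4));
}
```

Calling plausible_even_if_rotated(0xe2160458f3753a00) returns false, while a heap-style address such as 0x00006000022205f0 passes plausible_user_pointer.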

So, where did this come from? Well, it crashed right at the start of xpc_release, meaning that RDI hasn’t been modified, meaning that it’s simply the object to be released. Clearly that’s bogus.

Now let’s pop up a level and look at the caller:

(lldb) f 1
frame #1: … libxpc.dylib`_xpc_dictionary_node_free + 62
…
(lldb) disas -f
libxpc.dylib`_xpc_dictionary_node_free:
    …
    0x7fff6565c843 <+24>: movq   0x8(%rbx), %rcx
    0x7fff6565c847 <+28>: movq   %rcx, 0x8(%rax)
    0x7fff6565c84b <+32>: movq   0x8(%rbx), %rcx
    0x7fff6565c84f <+36>: movq   %rax, (%rcx)
    0x7fff6565c852 <+39>: movq   $-0x1, %rax
    0x7fff6565c859 <+46>: movq   %rax, (%rbx)
    0x7fff6565c85c <+49>: movq   %rax, 0x8(%rbx)
    0x7fff6565c860 <+53>: movq   0x10(%rbx), %rdi
    0x7fff6565c864 <+57>: callq  0x7fff65659d9e     ; xpc_release
->  0x7fff6565c869 <+62>: movq   %rbx, %rdi

RDI comes from RBX + 0x10. In this context RBX is an internal XPC data structure used to manage a dictionary node. Unfortunately XPC isn’t part of Darwin, but the basic structure looks like this:

struct Node {               // offsets
    struct Node * next;     // 0x00
    struct Node * prev;     // 0x08
    xpc_object_t value;     // 0x10
    unsigned int type;      // 0x18
    unsigned int pad;       // 0x1c
    char key[0];            // 0x20
};

where:

  • next and prev are managed by the standard BSD macros in <sys/queue.h>
  • key is an unbounded array containing the dictionary node’s key
  • pad exists because key is actually a union, and one item of that union has to be pointer aligned

WARNING I’m discussing this structure as an aid to debugging. Do not rely on it for anything other than that. It’s not considered API.
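If you want to double-check the offsets in the comments, a compile-time sketch like the following works. It assumes a 64-bit (LP64) target, uses a hypothetical xpc_object_sketch_t typedef as a stand-in for xpc_object_t so it builds without the XPC headers, and writes key as a standard flexible array member rather than key[0]. As with the structure itself, this is a debugging aid only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for xpc_object_t, which is an opaque pointer type. */
typedef void *xpc_object_sketch_t;

/* Debugging sketch of the dictionary node layout; not API. */
struct Node {
    struct Node *next;          /* 0x00 */
    struct Node *prev;          /* 0x08 */
    xpc_object_sketch_t value;  /* 0x10 */
    unsigned int type;          /* 0x18 */
    unsigned int pad;           /* 0x1c */
    char key[];                 /* 0x20, flexible array member */
};

/* These fail to compile if the layout differs (for example on a 32-bit target). */
static_assert(offsetof(struct Node, next)  == 0x00, "next offset");
static_assert(offsetof(struct Node, prev)  == 0x08, "prev offset");
static_assert(offsetof(struct Node, value) == 0x10, "value offset");
static_assert(offsetof(struct Node, type)  == 0x18, "type offset");
static_assert(offsetof(struct Node, pad)   == 0x1c, "pad offset");
static_assert(offsetof(struct Node, key)   == 0x20, "key offset");
```

The value offset of 0x10 is the one that matters here, because it matches the movq 0x10(%rbx), %rdi instruction just before the call to xpc_release.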

Let’s dump the RBX structure, first as bytes with ASCII and then as 8-byte words.

(lldb) m read -c 64 $rbx
0x6000022205f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ????????????????
0x600002220600: 00 3a 75 f3 58 04 16 e2 00 00 00 00 ef bb 3d ef  .:u?X..?....?=?
0x600002220610: 44 33 43 33 30 38 43 46 2d 32 32 35 31 2d 31 41  D3C308CF-2251-1A
0x600002220620: 34 36 2d 46 42 44 37 2d 32 35 33 44 42 32 42 32  46-FBD7-253DB2B2
(lldb) m read -f x -s 8 -c 8 $rbx
0x6000022205f0: 0xffffffffffffffff 0xffffffffffffffff
0x600002220600: 0xe2160458f3753a00 0xef3dbbef00000000
0x600002220610: 0x4643383033433344 0x41312d313532322d
0x600002220620: 0x2d374442462d3634 0x3242324244333532

As you can see, key is a UUID, which is pretty reasonable given the context established by the backtrace. The next and prev fields are both set to (void *) -1, which is the TRASHIT value from <sys/queue.h>. The type field is 0, which is reasonable in this context.

That leaves value and pad, and both of those are very wonky. The value field is 0xe2160458f3753a00, which is the RDI value that triggered the crash. And the pad field looks kinda similar, that is, a seemingly random sequence of bytes.

Alas, that’s about as far as I can take this in the time I have available on DevForums. It provides evidence for my initial suspicion, that this is a memory smasher of some form, but that doesn’t help you debug it. I’m actually quite surprised that ASan didn’t turn up anything else (Zombies is unlikely to help given that none of this is Objective-C or Swift).

Some questions:

  • Do you recognise the values in value (0xe2160458f3753a00) or pad (0xef3dbbef)? I’m curious if they’re anything obvious from the domain of your app.

  • When you ran with ASan, did you make sure to recompile your entire app, including any (non-system) libraries? When working with ASan, you want it to cover as much as possible, and that means you have to recompile everything with ASan enabled (it’s a shame that there’s no way to enable it for the system frameworks).

  • The other tool worth trying is libgmalloc. This is not nearly as cool as ASan, but it has one key advantage: It applies to all code in your process, including the system frameworks.

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

Replies

This is the best example I can find: support.shotgunsoftware.com/hc/en-us/articles/219042108-Setting-global-environment-variables-on-OS-X

There was a very well-hidden 'use-after-free' that libgmalloc brought to light.

libgmalloc FTW!

Oh, and just FYI…

how do you set the environment variables for GUI apps launched from the dock?

In cases like this, where it’s easy to modify the app itself, I generally set the LSEnvironment property in my Info.plist.

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"