Reply to Crash starting app after upgrade to Monterey
Hello,

I cannot debug any further, because I would need to compile a debuggable version of libmalloc. My findings so far:

Execution enters free (https://opensource.apple.com/source/libmalloc/libmalloc-317.140.5/src/malloc.c.auto.html), which then calls find_registered_zone. Inside find_registered_zone I lose track of the flow. I know that the tcmalloc library is called, because I can see the malloc library ask for the size of the pointer (which translates into a call to tcmalloc::mz_size). tcmalloc (here: https://github.com/gperftools/gperftools/blob/master/src/libc_override_osx.h#L108) then correctly calculates the size and returns 128.

Back in the malloc library, I see more code in find_registered_zone being executed, but at some point the default zone is returned instead of the tcmalloc zone, and free_definite_size is then called on it. But default_zone is the tcmalloc one, which doesn't have free_definite_size, hence the crash:

    ...
    (lldb) thread step-inst
    Process 95351 stopped
    * thread #1, queue = 'com.apple.main-thread', stop reason = instruction step into
        frame #0: 0x00007ff81bb921f1 libsystem_malloc.dylib`free + 337
    libsystem_malloc.dylib`free:
    ->  0x7ff81bb921f1 <+337>: popq   %rbp
        0x7ff81bb921f2 <+338>: jmpq   *%rax
        0x7ff81bb921f4 <+340>: xorl   %edi, %edi
        0x7ff81bb921f6 <+342>: testl  $0x140, 0x419face0(%rip)  ; malloc_num_zones_allocated, imm = 0x140
    Target 0: (crash) stopped.
    (lldb) thread step-inst
    Process 95351 stopped
    * thread #1, queue = 'com.apple.main-thread', stop reason = instruction step into
        frame #0: 0x00007ff81bb921f2 libsystem_malloc.dylib`free + 338
    libsystem_malloc.dylib`free:
    ->  0x7ff81bb921f2 <+338>: jmpq   *%rax
        0x7ff81bb921f4 <+340>: xorl   %edi, %edi
        0x7ff81bb921f6 <+342>: testl  $0x140, 0x419face0(%rip)  ; malloc_num_zones_allocated, imm = 0x140
        0x7ff81bb92200 <+352>: sete   %dil
    Target 0: (crash) stopped.
    (lldb) register read rax --format dec
         rax = 140703593735262  libsystem_malloc.dylib`default_zone_free_definite_size
    (lldb) thread step-inst
    Process 95351 stopped
    * thread #1, queue = 'com.apple.main-thread', stop reason = instruction step into
        frame #0: 0x00007ff81bb9245e libsystem_malloc.dylib`default_zone_free_definite_size
    libsystem_malloc.dylib`default_zone_free_definite_size:
    ->  0x7ff81bb9245e <+0>:  movq   0x41a16d63(%rip), %rdi  ; lite_zone
        0x7ff81bb92465 <+7>:  testq  %rdi, %rdi
        0x7ff81bb92468 <+10>: jne    0x7ff81bb92474  ; <+22>
        0x7ff81bb9246a <+12>: movq   0x419fa8b7(%rip), %rax  ; malloc_zones
    Target 0: (crash) stopped.
    ...

From tcmalloc's side, everything is returned as it should be (128 for the size), and it recognizes the pointer as its own.

How can I dig deeper? Thanks!
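To make the lookup step described above easier to follow, here is a minimal, portable sketch of the "ask each zone for the size of the pointer" idea. It is a toy model, not libmalloc's code: lookup_zone, lookup_zone_size and lookup_find_zone are hypothetical names, and the assumption that the first zone reporting a non-zero size is treated as the owner is only my reading of the behavior seen in the debugger.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for a malloc zone (not Apple's malloc_zone_t). */
typedef struct lookup_zone {
    const char *name;
    /* Returns the block size if the zone owns ptr, 0 otherwise. */
    size_t (*size)(struct lookup_zone *zone, const void *ptr);
    const void *base;   /* start of the toy memory region owned by this zone */
    size_t length;      /* length of that region in bytes */
    size_t block_size;  /* size reported for any pointer inside the region */
} lookup_zone;

/* Toy ownership test: a pointer belongs to the zone if it lies inside its region. */
size_t lookup_zone_size(lookup_zone *zone, const void *ptr) {
    uintptr_t p = (uintptr_t)ptr;
    uintptr_t b = (uintptr_t)zone->base;
    return (p >= b && p - b < zone->length) ? zone->block_size : 0;
}

/* Walk the registered zones and return the first one that claims the pointer.
 * The reported size is stored in *size_out (0 if nobody claims it). */
lookup_zone *lookup_find_zone(lookup_zone **zones, size_t count,
                              const void *ptr, size_t *size_out) {
    for (size_t i = 0; i < count; i++) {
        size_t size = zones[i]->size(zones[i], ptr);
        if (size != 0) {
            if (size_out != NULL) *size_out = size;
            return zones[i];
        }
    }
    if (size_out != NULL) *size_out = 0;
    return NULL;
}
```

In this model, the size callback plays the role that tcmalloc::mz_size plays in the trace above: it is how a zone tells the caller that it recognizes the pointer.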
Nov ’21
Reply to Crash starting app after upgrade to Monterey
Is there something else that may interfere? I get the same disassembly, yet I still have the crash:

    (lldb) r
    Process 8502 launched: '/Users/aple/workspace/gperftools/a.out' (x86_64)
    malloc(128) returned 0x100604000
    Process 8502 stopped
    * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
        frame #0: 0x0000000000000000
    error: memory read failed for 0x0
    Target 0: (a.out) stopped.
    (lldb) disas -n main
    a.out`main:
    ...
        0x100003f26 <+22>: movl   $0x80, %edi
        0x100003f2b <+27>: callq  0x100003f5e  ; symbol stub for: malloc
        0x100003f30 <+32>: movq   %rax, -0x18(%rbp)
    ...
        0x100003f46 <+54>: movq   -0x18(%rbp), %rdi
        0x100003f4a <+58>: callq  0x100003f58  ; symbol stub for: free
        0x100003f4f <+63>: xorl   %eax, %eax
    ...
    (lldb) bt
    * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
      * frame #0: 0x0000000000000000
        frame #1: 0x0000000100003f4f a.out`main + 63
        frame #2: 0x00000001000154fe dyld`start + 462
    (lldb)

I am linking with -ltcmalloc. If I patch tcmalloc to provide a custom free_definite_size (which simply calls tcmalloc's free) and then recompile and relink the simple PoC, the segfault goes away.

Unfortunately, I cannot go any deeper with the debugger (i.e., I cannot see what happens inside the called free).

Thanks!
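For readers who want to picture the workaround mentioned above, here is a minimal sketch of a free_definite_size callback that ignores the size hint and forwards to the zone's regular free. It is a portable toy model and not the actual gperftools patch: toy_zone and its functions are hypothetical names, and the struct only mimics the two relevant callbacks of a malloc zone.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down stand-in for a zone's callback table. */
typedef struct toy_zone {
    void (*free)(struct toy_zone *zone, void *ptr);
    /* Optional in the real API; the workaround always provides it. */
    void (*free_definite_size)(struct toy_zone *zone, void *ptr, size_t size);
    size_t free_calls;  /* bookkeeping so the forwarding can be observed */
} toy_zone;

/* The zone's regular free path (here simply the C library's free). */
void toy_zone_free(toy_zone *zone, void *ptr) {
    zone->free_calls++;
    free(ptr);
}

/* The workaround: drop the size hint and reuse the regular free path. */
void toy_zone_free_definite_size(toy_zone *zone, void *ptr, size_t size) {
    (void)size;  /* the hint is not needed to release the block */
    zone->free(zone, ptr);
}

/* Fill in both callbacks so that neither is left as NULL. */
void toy_zone_init(toy_zone *zone) {
    zone->free = toy_zone_free;
    zone->free_definite_size = toy_zone_free_definite_size;
    zone->free_calls = 0;
}
```

With both callbacks populated, a caller that jumps to free_definite_size unconditionally no longer lands on address 0x0, which matches the symptom disappearing after the patch.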
Nov ’21
Reply to Crash starting app after upgrade to Monterey
Hello!

After many iterations, we think we have arrived at the heart of the issue, but we need assistance to dig further (@eskimo, thanks for your help btw!).

TL;DR: apple-clang replaces calls to free(x) (where x was allocated with a size fixed at compile time) with free_definite_size(x). The problem is that, by definition in malloc.h, the free_definite_size callback can be NULL (and this is correctly checked in malloc.c: https://opensource.apple.com/source/libmalloc/libmalloc-317.140.5/src/malloc.c.auto.html, function void free(void* ptr)). tcmalloc doesn't implement free_definite_size: https://github.com/gperftools/gperftools/blob/master/src/libc_override_osx.h#L275 .

Any program that allocates and deallocates a fixed size is affected, for example:

    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char *argv[]) {
        void *ptr = malloc(128);
        printf("malloc(128) returned %p\n", ptr);
        free(ptr); /* replaced by free_definite_size(ptr, 128); */
        return 0;
    }

It crashes with tcmalloc (because the free_definite_size callback is missing) but runs fine without it.

Solution: we can provide patches to tcmalloc (even if the maintainer doesn't answer), but I think that either clang should be fixed, or the documentation should be updated and registering such a zone should fail with an error at compile time.

Can someone help me escalate this to the Apple clang developers? Thanks!
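To illustrate what "the callback can be NULL and must be checked" means in practice, here is a minimal, portable sketch of a guarded dispatch. It is a toy model under my own assumptions, not libmalloc's implementation: sketch_zone, sketch_release and the counters are hypothetical names introduced only for this example.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down stand-in for a zone with an optional callback. */
typedef struct sketch_zone {
    void (*free)(struct sketch_zone *zone, void *ptr);
    void (*free_definite_size)(struct sketch_zone *zone, void *ptr,
                               size_t size);  /* may be NULL */
    int plain_frees;  /* how often the regular free path ran */
    int sized_frees;  /* how often the sized free path ran */
} sketch_zone;

void sketch_plain_free(sketch_zone *zone, void *ptr) {
    zone->plain_frees++;
    free(ptr);
}

void sketch_sized_free(sketch_zone *zone, void *ptr, size_t size) {
    (void)size;  /* a real allocator could use the hint to skip a lookup */
    zone->sized_frees++;
    free(ptr);
}

/* Guarded dispatch: use the optional callback only if the zone provides it,
 * otherwise fall back to the mandatory free callback. */
void sketch_release(sketch_zone *zone, void *ptr, size_t size) {
    if (ptr == NULL) {
        return;
    }
    if (zone->free_definite_size != NULL) {
        zone->free_definite_size(zone, ptr, size);
    } else {
        zone->free(zone, ptr);
    }
}
```

A call path that skips this kind of check and jumps straight through a NULL free_definite_size pointer would fault at address 0x0, which is consistent with the EXC_BAD_ACCESS reported earlier in this thread.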
Nov ’21