Mach Port Leakage - Mojave

An application I am working on (mostly Swift 4.2 with sprinklings of C and C++) appears to be leaking Mach ports when run on Mojave and currently has 3,600 open after running for 12 hours. I am just looking at the 'Ports' field in Activity Monitor to see this.


The same binary running on Sierra is using about 500 ports.


The application is quite complex, making extensive use of DispatchQueue() and also has an associated privileged helper.


Unfortunately, I have been unable to create a simple test case that demonstrates the behaviour.


I have done a lot of searching to find out what may be causing the problem and come up with zilch. Is there any debugging tool (e.g. an Instruments plugin) that might help to narrow down where the allocation/leakage is coming from? I have seen the results of the tests on Google Chrome from a few years back but I'm not keen to start building, installing and debugging kernel extensions to track this problem down.

Replies

Well, you might find the culprit using the System Call Trace instrument of Instruments. You can look at the Mach system calls for allocating and deallocating ports and the call trees that are responsible for most such calls. If something is prominent in the allocation call trees but not in the deallocation call trees that might help narrow it down.


In addition to syscalls which explicitly allocate Mach ports, there are some which do so implicitly, such as mach_host_self(), mach_thread_self(), task_for_pid(), etc.
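
As a concrete case of an implicit allocation: each call to mach_thread_self() returns a send right and adds a user reference that the caller is expected to release with mach_port_deallocate(). Below is a minimal C++ sketch of pairing the two. The ScopedThreadPort class and the kHaveMach flag are names made up for illustration, and the non-macOS branch is only a stub so the snippet compiles elsewhere.

```cpp
#include <cstdint>
#if defined(__APPLE__)
#include <mach/mach.h>
constexpr bool kHaveMach = true;
#else
constexpr bool kHaveMach = false;   // stub build: no Mach APIs available
#endif

// Hypothetical helper (not from this thread): holds the send right returned
// by mach_thread_self() and releases it when the object goes out of scope.
class ScopedThreadPort {
public:
#if defined(__APPLE__)
    ScopedThreadPort() : port_(mach_thread_self()) {}
    ~ScopedThreadPort() {
        if (port_ != MACH_PORT_NULL) {
            // Drops the reference that mach_thread_self() added.
            mach_port_deallocate(mach_task_self(), port_);
        }
    }
#else
    ScopedThreadPort() : port_(0) {}
#endif
    ScopedThreadPort(const ScopedThreadPort&) = delete;
    ScopedThreadPort& operator=(const ScopedThreadPort&) = delete;

    std::uint32_t name() const { return port_; }   // port name in this task
    bool valid() const { return port_ != 0; }

private:
    std::uint32_t port_;
};
```

Code that calls mach_thread_self() in a loop without the matching deallocate keeps piling references onto the same name, which is one way a process can end up with rights it never releases.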


Unfortunately, it's also possible to receive Mach ports with mach_msg() and that will be hard to track. Plus, it's conceivable that the kernel or drivers could inject ports into your process in response to some seemingly-unrelated call.


It's conceivable that you could find a small number of routines in the kernel which are responsible for allocating and deallocating ports (regardless of the higher-level mechanism) and use the DTrace "fbt" provider to trace the user stack for those.

What Ken Thomases said and…

You should also take a look at the MachPortDump sample code. You can use this to help isolate the problem, as described in the Bug Hunting with MachPortDump section of the read me.

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

Thanks Quinn and Ken


On further observation the port usage seems to peak at about 3600 and doesn't seem to grow much beyond that. I found that just opening a window adds about 300 ports. On closing the recently opened window, the 300 additional ports are eventually deallocated. It looks as though there is some kind of garbage collection mechanism performing the deallocations as it takes a couple of minutes after closing the window before the port count drops to its previous level.


I tried Ken's suggestion of using the 'System Call' Instruments module. Interesting results but no smoking gun. I did spot another (unrelated to port usage) problem in the results - one that had not been obvious in my usual 'Timing' and 'Leaks' tests.


MachPortDump gives me a list of all the allocated ports but most of them have a 'Send' count of 1 and no associated rights that I can see. I have no way to see which higher level function was responsible for the allocation. As noted above - it's not clear that these are leaks, but rather just normal usage.


I'll leave this for now - it is high usage and not a leak, but quite different behaviour in Mojave compared to Sierra (and High Sierra).


Thanks for the help

Bryan

MachPortDump gives me a list of all the allocated ports but most of them have a 'Send' count of 1 and no associated rights that I can see.

Right. That just means that the receive right is held by some other process. There is, alas, no easy way to work out what that process is.

I have no way to see which higher level function was responsible for the allocation.

Indeed there’s no way to do this directly. You can, however, do this indirectly via the techniques outlined in the read me. Whether this is worth doing in your case is for you to decide. I will note, however, that 3600 ports does seem pretty high. For example, this command:

$ top -pid `pgrep Finder`

shows my Finder is currently using about 800 ports.

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

I went through a process of elimination in my application trying to find what is triggering the port usage. In the end I have found I can reproduce the problem by populating an NSTableView. If I clear the table (i.e. empty the data source array) then the ports are closed. As the number of rows increases, so does the number of used ports.


The table in question has no more than 100 rows, is displaying 6 columns and is updated every 4 seconds.


In my code I clear the array of table data, create a new array and if there are no errors, reassign the table data and then refresh the view.


I commented out the one line of code that assigns the table data, but still went through all the steps of actually building the array - i.e. all the code was executed that puts values into the table data array, thus eliminating DNS resolution etc. as the cause of the port allocations. The port count remained stable at about 480. I think this proves that it's the table refresh actually using all the ports.


Next step will be to try to build a simple test case and submit a bug report.

I'm guessing this has to do with the change to layer backing of views in Mojave. A table view has many subviews. They may be using lots of layers. Under the hood, layers may use IOAccel surfaces or something like that, and those may entail additional Mach ports.


If you build the same app against the 10.13 or earlier SDK and then run it on Mojave, does it exhibit the same high Mach port usage? Apps built for pre-10.14 SDKs should not get the new layer backing behavior.

Not so simple to build against the 10.13 SDK. The code is mostly Swift 4. Xcode 10 refuses to compile Swift 4 against anything other than the 10.14 SDK.


Anyway - I made a test app and am observing 1 mach port per table cell. I don't know if this is intended behaviour or a bug but I am leaning towards intended. The ports are there to be used and as long as the usage is constrained by the table size, then the only problem is seeing a big number in the Ports field of Activity Monitor. I'll hold off on submitting a bug report unless y'all think it's worthwhile.

A Mach port per cell seems rather excessive to me. Given that you’ve already done all the work to boil this down into a test app, I think it would be a good idea to file a bug about it. It may be that this is working as designed, but perhaps that design could be improved.

Please post your bug number, just for the record.

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

BugReporter #46171921 (NSTableView has very high mach port usage)

I was just reminded that there’s an equivalent of MachPortDump included in the OS, namely, lsmp. There’s no man page but its usage is helpful:

$ lsmp -h
Usage: lsmp -p <pid> [-a|-v|-h] 
Lists information about mach ports. Please see man page for description of each column.
    -p <pid> :  print all mach ports for process id <pid>. 
    -a :  print all mach ports for all processeses. 
    -v :  print verbose details for kernel objects.
    -j <path> :  save output as JSON to <path>.
    -h :  print this help.

MachPortDump is still useful if you want to write code to automate this sort of thing, but if you’re investigating from the command line then lsmp is much better (built in to macOS, better output, updated as the system evolves).

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

Thanks Quinn


I see lsmp offers a few additional details not shown by MachPortDump.



In my test application I see several hundred ports that have type NAMED-MEMORY. These seem to be the ports allocated by NSTableView as the number goes up and down according to the number of rows (cells) in the table.



In my real application, I am also seeing quite a few ports labelled 'Unknown Process'. I am guessing these are artefacts of analysing the ports related to child processes that have terminated after the port was collected from the OS. However, there are a lot of these and the application is not executing that many external processes. I am not seeing any zombies so there's another mystery.


Bryan

I have also isolated a Mach port leak in Process.run() and submitted a test application to bugreporter.

BugReporter #46547156


The test program I attached to the bug calls Process.run() to execute 'ls -l /' 30 times and clearly shows the increasing Mach port count.
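
For readers who want to watch the count from inside a process, here is a hedged C++ sketch of measuring how many port names the current task holds before and after a repeated action. It is not the test program attached to the bug. It assumes mach_port_names() is an acceptable way to count names, the function names count_port_names and port_growth are invented for this example, and on non-macOS builds the counter is a stub that returns -1.

```cpp
#include <functional>
#if defined(__APPLE__)
#include <mach/mach.h>
#endif

// Returns the number of port names in the current task, or -1 if unavailable.
long count_port_names() {
#if defined(__APPLE__)
    mach_port_name_array_t names = nullptr;
    mach_msg_type_number_t nameCount = 0;
    mach_port_type_array_t types = nullptr;
    mach_msg_type_number_t typeCount = 0;
    kern_return_t kr = mach_port_names(mach_task_self(), &names, &nameCount,
                                       &types, &typeCount);
    if (kr != KERN_SUCCESS) {
        return -1;
    }
    // The arrays are returned as out-of-line memory and must be released.
    vm_deallocate(mach_task_self(), reinterpret_cast<vm_address_t>(names),
                  nameCount * sizeof(mach_port_name_t));
    vm_deallocate(mach_task_self(), reinterpret_cast<vm_address_t>(types),
                  typeCount * sizeof(mach_port_type_t));
    return static_cast<long>(nameCount);
#else
    return -1;   // stub: Mach APIs are not available on this platform
#endif
}

// Runs `action` `repeats` times and reports how many port names were gained.
// Returns 0 when the count could not be read.
long port_growth(const std::function<void()>& action, int repeats) {
    const long before = count_port_names();
    for (int i = 0; i < repeats; ++i) {
        action();
    }
    const long after = count_port_names();
    if (before < 0 || after < 0) {
        return 0;
    }
    return after - before;
}
```

A growth figure that scales with the repeat count points at the action as the source. A one-off jump is more likely lazy setup, so it helps to run the action once before measuring.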


I also built (and included with my test program) a dylib that, when preloaded against an executable, intercepts calls to mach_port_construct(), mach_port_destruct(), mach_port_allocate(), mach_port_deallocate() etc.


In the 'hooked' version of the allocation routines I capture the current callstack and save it to a std::map keyed on the port name.


At program exit, I stole the allocated port iteration code from MachPortDump (thanks Quinn) to print the callstack of all allocated mach ports.
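
The bookkeeping described above can be sketched roughly as follows. This is not the library attached to the bug report, only an illustration of the idea: PortLedger, PortName and CallStack are hypothetical names, the call stack is represented as a list of strings rather than real captured frames, and a deallocate is treated as removing the name outright, which ignores reference counts.

```cpp
#include <cstdint>
#include <map>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

using PortName = std::uint32_t;               // stand-in for mach_port_name_t
using CallStack = std::vector<std::string>;   // stand-in for captured frames

// Records which call stack created each port name that is still live.
class PortLedger {
public:
    // Called from the hooked allocation routines.
    void record_allocate(PortName name, CallStack stack) {
        std::lock_guard<std::mutex> lock(mutex_);
        live_[name] = std::move(stack);
    }

    // Called from the hooked deallocation routines.
    void record_deallocate(PortName name) {
        std::lock_guard<std::mutex> lock(mutex_);
        live_.erase(name);
    }

    // Snapshot of everything allocated but never released, for the exit dump.
    std::vector<std::pair<PortName, CallStack>> outstanding() const {
        std::lock_guard<std::mutex> lock(mutex_);
        std::vector<std::pair<PortName, CallStack>> out(live_.begin(),
                                                        live_.end());
        return out;
    }

private:
    mutable std::mutex mutex_;   // hooks may fire on any thread
    std::map<PortName, CallStack> live_;
};
```

At exit, the entries returned by outstanding() can be compared against the port names the task actually holds; names that are live but missing from the ledger were created by a path the hooks did not see.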


There are a few leaked allocations that I can't account for, presumably because they were allocated from code that wasn't intercepted by my library, (maybe I am missing some hooks) but the ports associated with Process.run() were clearly present.

BugReporter #46171921 (NSTableView has very high mach port usage)

BugReporter #46547156

Hey hey, we give you a new tool and you totally go to town with it! (-:

Thanks for filing both of these bugs.

There are a few leaked allocations that I can't account for, presumably because they were allocated from code that wasn't intercepted by my library, (maybe I am missing some hooks)

One of the difficulties in tracking Mach port leaks is that the kernel can allocate ports in your app without running any code in your process. Specifically, if you receive a Mach message with an attached port right, the act of receiving that message causes the kernel to allocate a name for that port in your process (if it doesn’t already exist) and then increment the ref count on the corresponding right on that port.
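
To make that receive-side behaviour easier to picture, here is a toy C++ model of a task's port name space. It is purely illustrative and is not how the kernel is implemented: ToyNameSpace, PortId and the numbering of names are all assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

using PortName = std::uint32_t;   // name as seen inside the process
using PortId = std::uint64_t;     // hypothetical identity of the real port

class ToyNameSpace {
public:
    // Models receiving a message that carries a send right to `port`:
    // a name is created if none exists, otherwise its ref count goes up.
    PortName receive_send_right(PortId port) {
        auto it = name_for_port_.find(port);
        if (it == name_for_port_.end()) {
            const PortName name = next_name_++;
            name_for_port_[port] = name;
            port_for_name_[name] = port;
            urefs_[name] = 1;
            return name;
        }
        ++urefs_[it->second];
        return it->second;
    }

    // Models dropping one reference; the name disappears when it hits zero.
    // Returns false if the name is not present.
    bool deallocate(PortName name) {
        auto it = urefs_.find(name);
        if (it == urefs_.end()) {
            return false;
        }
        if (--it->second == 0) {
            name_for_port_.erase(port_for_name_[name]);
            port_for_name_.erase(name);
            urefs_.erase(it);
        }
        return true;
    }

    std::size_t name_count() const { return urefs_.size(); }

    unsigned urefs(PortName name) const {
        auto it = urefs_.find(name);
        return it == urefs_.end() ? 0u : it->second;
    }

private:
    PortName next_name_ = 0x103;   // arbitrary starting value
    std::map<PortId, PortName> name_for_port_;
    std::map<PortName, PortId> port_for_name_;
    std::map<PortName, unsigned> urefs_;
};
```

In this model, code that receives the same right twice but releases it once leaves the name in place, and nothing in the process ever called an allocation routine that a hook could have caught.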

Mach messaging is crazy complicated and it’s very easy to make mistakes that lead to port leaks, and then very hard to debug those mistakes. That’s the main reason why we encourage developers to not use Mach messaging directly, but instead lean on higher-level APIs like XPC. Alas, sometimes it’s not possible for our framework code to follow that advice )-:

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

Thanks for the comment Quinn.


One of the difficulties in tracking Mach port leaks is that the kernel can allocate ports in your app without running any code in your process.


I thought that was the case but was relieved to see the leak that I suspected to be present, was actually being captured.


It would be kind of neat if someone who knows a lot more about Mach ports than me (hint hint) were to fix up my hook library and make it more generally available, with a better method of attaching it to the target executable. It's all kind of clunky at the moment, but beats the heck out of trying to do the same thing with a kext 🙂.

So, today my Finder crashed with


Crashed Thread:        10  Dispatch queue: sync queue: vRefNum = -100(boot)

Exception Type:        EXC_BAD_INSTRUCTION (SIGILL)
Exception Codes:       0x0000000000000001, 0x0000000000000000
Exception Note:        EXC_CORPSE_NOTIFY

Termination Signal:    Illegal instruction: 4
Termination Reason:    Namespace SIGNAL, Code 0x4
Terminating Process:   exc handler [5323]

Application Specific Information:
dyld3 mode
*** The system has no mach ports available. You may be able to diagnose which application(s) are using ports by using 'top' or Activity Monitor. (3) ***

Looking at Activity Monitor, I see Finder (since relaunching a few hours ago) has 29,500 ports open (this on macOS 10.14.6), Xcode has 7,591, WindowServer 35,323, and most apps have under 1000. You say 3600 seems pretty high. I've been noticing my Finder SPODs a lot lately, too, even prior to updating from 10.14.5 to 10.14.6.