Transparent Proxy Provider, UDP, mbufs, and inevitable panics

First, for the  employees reading, I filed FB14844573 over the weekend, because this is a reproducible panic or hang. whee

I ran our stress tests for an entire long weekend, and my machine panicked because of mbufs. Normally I tell my coworkers that we can't really do anything to cause a panic -- but we're doing network things, so this is an exception. I started periodically checking the mbuf count while the tests were running -- netstat -m | grep 'mbufs in use' -- and noticed that it only ever went up and never decreased, even after I killed our code and uninstalled the extensions. (It increases at roughly 4 mbufs/sec.)
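
For anyone who wants to watch for the same growth, here is a rough sketch that wraps the netstat -m check above in a sampling loop. It assumes the relevant line has the form "<in use>/<total> mbufs in use:" and takes the first number on that line; the function names (parse_mbufs_in_use, mbuf_rate, watch_mbufs) and the 60-second default interval are made up for illustration and are not part of any tool.

```shell
#!/bin/sh
# Print the in-use mbuf count from `netstat -m` output read on stdin.
# Assumption: the line looks like "4352/9170 mbufs in use:"; we take the
# first run of digits on the first matching line.
parse_mbufs_in_use() {
    grep 'mbufs in use' | head -n 1 | sed -E 's/^[^0-9]*([0-9]+).*/\1/'
}

# Print the growth per second between two samples (integer division).
# $1 = earlier count, $2 = later count, $3 = seconds between the samples
mbuf_rate() {
    echo $(( ($2 - $1) / $3 ))
}

# Hypothetical helper: sample until interrupted, printing the time, the
# current count, and the growth rate since the previous sample.
watch_mbufs() {
    interval=${1:-60}
    prev=$(netstat -m | parse_mbufs_in_use)
    while sleep "$interval"; do
        cur=$(netstat -m | parse_mbufs_in_use)
        echo "$(date '+%H:%M:%S') in_use=$cur rate=$(mbuf_rate "$prev" "$cur" "$interval")/s"
        prev=$cur
    done
}
```

Source the file and run watch_mbufs 60 in a terminal while the stress test is going. A rate that stays positive after the extension has been stopped and uninstalled would match the behaviour described here.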

Today I confirmed that this only happens if we include UDP packets:

        let udpRule = NENetworkRule(destinationNetwork: host, prefix: 0, protocol: .UDP)
        let tcpRule = NENetworkRule(destinationNetwork: host, prefix: 0, protocol: .TCP)
        ...
        settings.includedNetworkRules = [udpRule, tcpRule]

If I comment out the udpRule part, mbufs don't leak.

Our handleNewUDPFlow(_:initialRemoteEndpoint:) method checks whether the application is a friendly one, and if so it returns false. If it isn't friendly, we want to block QUIC packets:

        if let host = endpoint as? NWHostEndpoint {
            if host.port == "80" || host.port == "443" {
                // We need to open it and then close it
                flow.open(withLocalEndpoint: nil) { error in
                    Self.workQueue.asyncAfter(deadline: .now() + 0.01) {
                        let err = error ?? POSIXError(POSIXErrorCode.ECONNABORTED)
                        flow.closeReadWithError(err)
                        flow.closeWriteWithError(err)
                    }
                }
                return true
            }
        }
        return false

Has anyone else run into this? I can't see that it's my problem at that point, since the only thing we do with UDP flows is to either say "we don't want it, you handle it" or "ok sure, we'll take it but then let's close it immediately".

Ok. I just took my simple extension test program (which sets up a TPP that returns false from both handleNew*Flow methods) and it exhibits the same behaviour. Again, only with UDP.

I filed FB14844573 over the weekend

Thanks. I think that’s the best path forward for this issue.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Well, that's the best way for Apple. It doesn't help me at all, although I guess there's no point to filing a TSI. 😄

My question at the beginning was whether anyone else has run into this, and if they have, whether they found any mitigations (other than not including UDP in includedNetworkRules). I'd still love to know if anyone has, especially about mitigations. 😄

Despite no feedback response, the issue seems to be resolved as of 15.1 or later -- we ran our tests for 55 hours with no insane mbuf leak, and, of course, no panic.
