NEDNSProxyProvider closes socket unexpectedly

My NEDNSProxyProvider subclass basically does this for each incoming UDPFLow:


open()

while moreDataComing {

read()

filter() // modify EDNS0

write()

}

closeWrites()

closeReads()


What I sometimes see though is that after the first read/write sequence, iOS appears to close the UDPFlow for me. I see these in the logs:


default 16:12:09.186963 -0800 MyProxy (2552644817): Closing reads, not closed by plugin

default 16:12:09.187134 -0800 MyProxy (2552644817): Closing writes, not sending close


Then later when I attempt to close the flow explicitly I get:


default 16:12:09.273912 -0800 MyProxy writeDatagrams finished with error: Optional(Error Domain=NEAppProxyFlowErrorDomain Code=1 "The operation could not be completed because the flow is not connected" UserInfo={NSLocalizedDescription=The operation could not be completed because the flow is not connected})


Sometimes I also see


error 17:01:55.396243 -0800 MyProxy (3009813469): flow is closed for writes, cannot write 111 bytes of data


I'm not 100% sure if this is actually a problem or not as far as the actual functioning of the proxy goes, but I'd like to understand why it's happening. Is there something I'm doing or not doing that's causing iOS (or CFNetwork or whatever) to close the socket before I can do so explicitly? My class has a strong reference to the NEAppProxyUDPFlow object so it's not like it's getting deallocated early or anything.

Replies

I’d like to clarify some aspects of your pseudo code:

  • As written, it looks like you’re doing this all synchronously. That’s quite unusual, because the I/O APIs are asynchronous. Is this just because you’re writing pseudo code? Or do you have some sort of ‘synthetic synchronous’ thing going on?

  • In this code, is

    read()
    reading from an
    NEAppProxyUDPFlow
    ? And is
    write()
    writing to an
    NEAppProxyUDPFlow
    ? If so, have you just elided the code that sends the request out on the wire?

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

Hi Quinn, thanks for answering and sorry for the delay.


> As written, it looks like you’re doing this all synchronously. That’s quite unusual, because the I/O APIs are asynchronous. Is this just because you’re writing pseudo code? Or do you have some sort of ‘synthetic synchronous’ thing going on?


The actual code isn't synchronous because it uses all the block-based APIs from `NEAppProxyUDPFlow` which are inherently async. For example, the `open()` call from the pseudo code actually looks like this:



    private func open() {
        // The endpoint is ignored. From the NEAppProxyFlow.h header: "If the source application already specifed a local endpoint by binding the socket then this parameter is ignored."
        let bogusEndpoint = NWHostEndpoint(hostname: "0.0.0.0", port: "0")
        udpFlow.open(withLocalEndpoint: bogusEndpoint) { [unowned self] (error) in
            Logger.fileLog("\(self.logIdentifier) - opened with error: \(error.debugDescription)")
            if let error = error {
                Logger.fileLog("*** \(self.logIdentifier) - error: \(error)")
            } else {
                self.handleMoreData()
            }
        }
    }


You can see the entire project in the radar I filed: rdar://48495244 .


> In this code, is read() reading from an NEAppProxyUDPFlow? And is write() writing to an NEAppProxyUDPFlow? If so, have you just elided the code that sends the request out on the wire?


Yes, read() is in fact `NEAppProxyUDPFlow.readDatagrams` and write() is `NEAppProxyUDPFlow.writeDatagrams`. I'm not sure about your second question but here's some more detail.


For each incoming `NEAppProxyUDPFlow`, we:


while(!done)

udpFlow.readDatagrams()

Add some provisioning info to the EDNS0 part of each datagram

Send the DNS request to our own DNS server

Take the response data from that request and feed it to

udpFlow.writeDatagrams()



And continue that sequence until udpFlow.readDatagrams() returns 0 datagrams (or an error). Here's the actual code from that loop:



    private func handleMoreData() {
        udpFlow.readDatagrams() { [unowned self] (datagrams, endpoints, error) in
            if let error = error {
                self.finish(error: error, logStr: "*** Error: \(error). Closing flow for read and write.")
                return
            }
            if let datagrams = datagrams, let endpoints = endpoints, !datagrams.isEmpty {
                let modifiedData = self.addEDNS0(datagrams)
                if modifiedData.isEmpty {
                    self.finish(error: nil, logStr: "*** Modified data was empty. Closing flow and bailing.")
                    return
                }
               
                let responseData = self.send(modifiedData)
                if responseData.isEmpty {
                    self.finish(error: nil, logStr: "*** Response data was empty. Closing flow and bailing.")
                    return
                }
               
                self.writeDNSResponse(responseData, endpoints) { (error) in
                    if let error = error {
                        self.finish(error: error, logStr: "*** Write failed")
                    } else {
                        self.handleMoreData()
                    }
                }
            } else {
                self.finish(error: nil, logStr: "No more data. Closing flow.")
            }
        }
    }


Where we see the error (and this happens about 50% of the time) is in `writeDNSResponse` which looks like:



    private func writeDNSResponse(_ datagrams: [Data], _ endpoints: [NWEndpoint], completion: @escaping(_ error: Error?)->Void) {
        udpFlow.writeDatagrams(datagrams, sentBy: endpoints) { (error) in
            if let error = error {
                Logger.fileLog("*** \(self.logIdentifier) - error: \(error)")
            }
            completion(error)
        }
    }

Does that help pinpoint the problem at all?

Oh another thing that occurred to me, is it a problem that my proxy itself hits the network? Could I be hitting some sort of inception situation where the proxy is intercepting its own calls?

Oh another thing that occurred to me, is it a problem that my proxy itself hits the network?

No. It’s fully expected that a DNS proxy provider will use the network.

The actual code isn't synchronous because it uses all the block-based APIs from

NEAppProxyUDPFlow
which are inherently async.

OK, but there still seems to be synchronous code in there, specifically this line:

let responseData = self.send(modifiedData)

Presumably

send(_:)
hits the network, and it must do that synchronously for it return the response as a function result. Right?

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

Yes, that's true— that's a synchronous network call. If you think that could be the cause then I'll work to make it async.

If you think that could be the cause then I'll work to make it async.

Yeah, that’d be a good idea. I can’t guarantee that’ll fix things, but your current approach is definitely worrying.

Remember that each flow represents a serialisation context. For example, if you do this:

udpFlow.readDatagrams() { … in 
    // block A
    … your code …
}
udpFlow.readDatagrams() { … in 
    // block B
    … your code …
}

the flow guarantees that block A and B won’t run concurrently.

The flow is using some sort of mutex (I believe it’s a Dispatch serial queue, but I haven’t looked at the code) to guarantee this. If you block for long periods of time in your callback (blocks A and B in this example), you are effectively holding the mutex for the duration. I don’t know what else is serialised by that mutex, but it’s easy imagine this causing other parts of the flow to behave strangely.

NetworkExtension providers were designed to operate along standard Dispatch lines, that is, you shouldn’t block waiting for I/O in your callbacks. It’s OK to bend those rules in some circumstances — when you take a page fault, you end up waiting for disk I/O — but network I/O can take a long time and you don’t want to block the queue for that long. For example, if your

send(_:)
method applies the standard DNS timeout of 30 seconds, that’d be bad.

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"
Hey @scalo and @eskimo, this might seem a bit off topic from the question but am trying to do something similar. I am trying to figure out a way to send the data received with udpFlow.readDatagrams() to a custom DNS server. Could you please tell me how you have implemented the addEDNS0(datagrams) and the send(modifiedData) functions. That'll be really helpful. Thanks!

[I] am trying to do something similar.

I see that you created a new thread for this and Matt has responded there.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"