A server with the specified hostname could not be found exception

Hi, I have been working on the app that implements DNS Proxy Extension for a while now, and after a couple builds to TestFlight I noticed that I got a couple crashes that seem to be triggered by EXC_BREAKPOINT (SIGTRAP)

After some investigation, it was found that crashes are connected to CFNetwork framework. So, I decided to additionally look into memory issues, but I found the app has no obvious memory leaks and no memory regression (within the recommended 25%; the actual value is 20% as of right now). The app still has a memory footprint of 11 MB, though, and most of it (6.5 MB) is Swift metadata.

At this point, I'm not sure what's triggering those crashes, but I noticed that the app sometimes prints a message like this to the console (this example is for the PostHog API that I use in the app):

Task <0ABDCF4A-9653-4583-9150-EC11D852CA9E>.<1> finished with error [18 446 744 073 709 550 613] Error Domain=NSURLErrorDomain Code=-1003 "A server with the specified hostname could not be found." UserInfo={_kCFStreamErrorCodeKey=8, NSUnderlyingError=0x1072df0f0 {Error Domain=kCFErrorDomainCFNetwork Code=-1003 "(null)" UserInfo={_kCFStreamErrorDomainKey=12, _kCFStreamErrorCodeKey=8, _NSURLErrorNWResolutionReportKey=Resolved 0 endpoints in 2ms using unknown from cache, _NSURLErrorNWPathKey=satisfied (Path is satisfied), interface: en0[802.11], ipv4, dns, uses wifi}}, _NSURLErrorFailingURLSessionTaskErrorKey=LocalUploadTask <0ABDCF4A-9653-4583-9150-EC11D852CA9E>.<1>, _NSURLErrorRelatedURLSessionTaskErrorKey=(
    "LocalUploadTask <0ABDCF4A-9653-4583-9150-EC11D852CA9E>.<1>"
), NSLocalizedDescription=A server with the specified hostname could not be found., NSErrorFailingURLStringKey=https://us.i.posthog.com/batch, NSErrorFailingURLKey=https://us.i.posthog.com/batch, _kCFStreamErrorDomainKey=12}

If DNS Proxy Provider uses custom DoH server for resolving packets, could the cache policy for URLSession be a reason?

I had a couple other ideas (HTTP3 failure, CFNetwork core issues like described here) but not sure if they are valid

Would be grateful if someone could give me a hint of what I should look at

after a couple builds to TestFlight I noticed that I got a couple crashes that seem to be triggered by EXC_BREAKPOINT (SIGTRAP)

This usually means you’ve hit a trap. It’s common to see this in Swift code — for example, if you access an array out of bounds or force unwrap an optional that’s nil — but it can also be triggered by non-Swift code, including system frameworks.
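To make the two Swift examples concrete, here is a minimal sketch of each trap next to a non-trapping alternative. The values are made up for illustration and are not taken from the app in question.

```swift
import Foundation

let values = [10, 20, 30]
let index = 5

// values[index] would trap here because 5 is out of bounds.
// Checking the index first turns the trap into a handled case.
let element: Int? = values.indices.contains(index) ? values[index] : nil
print(element ?? -1)

let text = "not a number"
// Int(text)! would trap because the conversion returns nil.
// Optional binding handles the nil case without trapping.
if let number = Int(text) {
    print("parsed \(number)")
} else {
    print("could not parse")
}
```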

After some investigation, it was found that crashes are connected to CFNetwork framework. So, I decided to additionally look into memory issues …

Why did you decide to do that? Did you have specific evidence that your CFNetwork issue was memory related?

Can you post a crash report showing this trap exception? See Posting a Crash Report for advice on how to do that.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Hi, thank you for your response.

Here are two examples of the crash reports for EXC_BREAKPOINT (SIGTRAP)

Why did you decide to do that? Did you have specific evidence that your CFNetwork issue was memory related?

No, I didn't have specific evidence. However, earlier I had crashes with EXC_BAD_ACCESS (SIGKILL) errors referencing PAC, so I thought it'd be a good idea to investigate possible memory issues. Additionally, some of the crashes were resolved by rewriting part of the networking module for the DNS Proxy with async/await instead of completion handlers with Result.

Thanks for the crash reports.

Both of those indicate memory corruption, not memory exhaustion. Specifically:

  • In the first you’re trapping in __CFCheckCFInfoPACSignature, indicating a pointer authentication check failed.

  • In the second you’re trapping in _xzm_xzone_malloc_tiny_outlined, which is because it’s detected borkage in the malloc data structures.

In short, I think you have a memory management bug in your code, and I recommend that you apply the standard memory debugging tools.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

I see now. Quick question, could it be related to overuse of UserDefaults?

The reason I am asking is because some of the temporary data is stored in UserDefaults for my app.

I will give you an example. Because my app uses Content Filter (Filter Data Provider has sandbox restrictions), I wasn't able to use FileManager or CoreData for storing some information from Filter Data Provider because access was denied. So I had to use UserDefaults.

  • can't share the whole idea, but in a nutshell I needed to store resolved ips from flows

Additionally, I use UserDefaults for some data that is accessed from MDM config profile and shared to UI components via KVO

Tried to add Address Sanitizer but received the same runtime issue as here

upd: fixed by disabling other diagnostics tools 🥲

Not sure if this is right, please correct me if I am wrong here. One of the possible causes for my issue could also be concurrent access to one memory address?

Ideally, I should probably rewrite some of the code to use FileManager under an App Group for large data?

One of the possible causes for my issue could also be concurrent access to one memory address?

Yes. Concurrency bugs can manifest as memory corruption.

could it be related to overuse of UserDefaults?

That’s unlikely. The UserDefaults API is not a common source of memory corruption issues.

I wasn't able to use FileManager or CoreData for storing some information from Filter Data Provider because access was denied. So I had to use UserDefaults.

Which provider is writing this data? And which provider is reading it?

A filter data provider should have read/write access to its own container. So, if you want to persist data within your filter data provider, any file system API should work for that.

OTOH, if you want to write data in one provider and read it in another, things get more complex.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Which provider is writing this data? And which provider is reading it?

In my case, Filter Control Provider writes data received from MDM configuration profile, then Filter Data Provider reads this data to use it for flow filtering. But my Filter Data Provider also writes some data about intercepted flow, that is later used for resolving them.

OTOH, if you want to write data in one provider and read it in another, things get more complex.

I think that's a great explanation for the problem I had with Core Data. Because my Content Filter is not limited to just two providers, I think target membership for the custom controllers that add more logic to flow filtering could have granted the main target access to these components. That would explain why I received sandbox restriction errors.

I guess my next steps would be ensuring that concurrent access is handled properly and maybe bringing back Core Data for Filter Data Provider

Thank you!

Accepted Answer
In my case, Filter Control Provider writes data received from MDM configuration profile, then Filter Data Provider reads this data to use it for flow filtering.

OK. That should be possible by putting the data into an app group. The control provider will have read/write access to that app group; the data provider will only be able to read it.

And, yes, you will need some sort of concurrency control there (-:
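As a rough illustration of that arrangement, here is a minimal sketch in which the writer saves a small configuration file atomically and the reader loads it. The `FilterConfig` type, the file name, and the helper functions are hypothetical, and atomic writes are only one simple form of concurrency control. In the real providers the directory would come from `FileManager.default.containerURL(forSecurityApplicationGroupIdentifier:)` with your own group identifier; the demo at the end uses the temporary directory instead so that it runs anywhere.

```swift
import Foundation

// Hypothetical payload; the real configuration shape is not shown in this thread.
struct FilterConfig: Codable, Equatable {
    var blockedHosts: [String]
}

// Writer side (the control provider in this thread).
func writeConfig(_ config: FilterConfig, to directory: URL) throws {
    let data = try JSONEncoder().encode(config)
    // .atomic writes to a temporary file and then renames it into place,
    // so a reader should never observe a half-written file.
    try data.write(to: directory.appendingPathComponent("filter-config.json"), options: .atomic)
}

// Reader side (the data provider in this thread).
func readConfig(from directory: URL) throws -> FilterConfig {
    let data = try Data(contentsOf: directory.appendingPathComponent("filter-config.json"))
    return try JSONDecoder().decode(FilterConfig.self, from: data)
}

// Demo round trip using the temporary directory as a stand-in for the app group container.
let demoDirectory = FileManager.default.temporaryDirectory
let original = FilterConfig(blockedHosts: ["example.com"])
_ = try? writeConfig(original, to: demoDirectory)
let loaded = try? readConfig(from: demoDirectory)
print(loaded == original)
```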

But my Filter Data Provider also writes some data about intercepted flow, that is later used for resolving them.

As long as this only needs to be read back by the data provider, you’re all good. Just put the data into the data provider’s container.

You still might need concurrency control though, although it’s only intra-process concurrency control. That is, multiple threads within the data provider might be accessing this data and you have to make sure they don’t stomp on each other.
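A minimal sketch of that kind of intra-process protection, assuming a simple in-memory store guarded by a lock. The `ResolvedAddressStore` type and its methods are hypothetical; a serial queue or an actor would be equally reasonable choices.

```swift
import Foundation

// Hypothetical in-memory store for resolved addresses, guarded by a lock so that
// multiple threads inside the data provider cannot mutate it at the same time.
final class ResolvedAddressStore {
    private let lock = NSLock()
    private var addressesByHost: [String: [String]] = [:]

    func record(address: String, for host: String) {
        lock.lock()
        defer { lock.unlock() }
        addressesByHost[host, default: []].append(address)
    }

    func addresses(for host: String) -> [String] {
        lock.lock()
        defer { lock.unlock() }
        return addressesByHost[host] ?? []
    }
}

let store = ResolvedAddressStore()
// Simulate many threads recording results at once.
DispatchQueue.concurrentPerform(iterations: 1_000) { i in
    store.record(address: "192.0.2.\(i % 256)", for: "example.com")
}
print(store.addresses(for: "example.com").count)
```

Without the lock, the concurrent appends would race on the dictionary, which is the kind of bug that can surface as memory corruption.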

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

OK. That should be possible by putting the data into an app group. The control provider will have read/write access to that app group; the data provider will only be able to read it.

And, yes, you will need some sort of concurrency control there (-:

Yep, that's exactly how I did it. My concurrency control for now is a shared container KVO with serial queue for read and async write for observed property within Content Filter scope.

And thank you for your previous response, it seems like the number of crashes decreased a lot for the new build, since I added some concurrency control for DNS Proxy Extension. It still requires some investigation but overall stability looks better

Hi, it's been a while but I just wanted to give a quick update on the app and ask a couple questions.

Ever since I changed the shared container access and data sharing mechanism between the targets, the app doesn't seem to crash anymore with EXC_BREAKPOINT (SIGTRAP). However, the issue with the app not being able to find a server still persists.

Connection 4: received failure notification
Connection 4: failed to connect 12:8, reason 18 446 744 073 709 551 615
Connection 4: encountered error(12:8)
Task <01313C44-8C0D-4B29-8924-AB530B062FB7>.<3> HTTP load failed, 0/0 bytes (error code: 18 446 744 073 709 550 613 [12:8])

Task <01313C44-8C0D-4B29-8924-AB530B062FB7>.<3> finished with error [18 446 744 073 709 550 613] Error Domain=NSURLErrorDomain Code=-1003 "A server with the specified hostname could not be found." UserInfo={_kCFStreamErrorCodeKey=8, NSUnderlyingError=0x10c64cc50 {Error Domain=kCFErrorDomainCFNetwork Code=-1003 "(null)" UserInfo={_kCFStreamErrorDomainKey=12, _kCFStreamErrorCodeKey=8, _NSURLErrorNWResolutionReportKey=Resolved 0 endpoints in 5ms using unknown from cache, _NSURLErrorNWPathKey=satisfied (Path is satisfied), interface: en0[802.11], ipv4, dns, uses wifi}}, _NSURLErrorFailingURLSessionTaskErrorKey=LocalDataTask <01313C44-8C0D-4B29-8924-AB530B062FB7>.<3>, _NSURLErrorRelatedURLSessionTaskErrorKey=(
    "LocalDataTask <01313C44-8C0D-4B29-8924-AB530B062FB7>.<3>"
), NSLocalizedDescription=A server with the specified hostname could not be found., NSErrorFailingURLStringKey=https://api_url, NSErrorFailingURLKey=https://api_url, _kCFStreamErrorDomainKey=12}
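A note on reading these logs: the large bracketed numbers look like signed error codes printed as unsigned 64-bit values. The sketch below reinterprets them; the helper name `signedCode` is hypothetical. The first value comes out as -1003, which matches the NSURLErrorDomain code in the same message. As far as I know, the `12:8` pair is CFStream error domain 12 (NetDB) with code 8, which on Apple platforms corresponds to `EAI_NONAME`, i.e. the resolver returned no result for the name.

```swift
import Foundation

// Reinterprets the unsigned value shown in the log as a signed 64-bit integer
// with the same bit pattern.
func signedCode(fromUnsigned raw: UInt64) -> Int64 {
    Int64(bitPattern: raw)
}

let taskErrorCode = signedCode(fromUnsigned: 18_446_744_073_709_550_613)
let connectionReason = signedCode(fromUnsigned: 18_446_744_073_709_551_615)
print(taskErrorCode)     // -1003
print(connectionReason)  // -1
```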

While investigating the issue, I found a couple of articles among Apple's Network Extension guides. I took some advice from those articles regarding networking within an app that includes Network Extensions:

  • have separate URL session configurations for each target
  • use timeouts for outgoing requests, etc.

But it didn't really change anything
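For reference, the two suggestions above could look roughly like the sketch below. The `makeConfiguration` helper and the timeout values are illustrative assumptions, not the exact settings used in the app.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking // only needed on non-Apple platforms
#endif

// Builds a separate session configuration with explicit timeouts.
// The timeout values are illustrative assumptions, not recommendations.
func makeConfiguration(ephemeral: Bool) -> URLSessionConfiguration {
    let config: URLSessionConfiguration = ephemeral ? .ephemeral : .default
    config.timeoutIntervalForRequest = 15   // max seconds to wait for additional data
    config.timeoutIntervalForResource = 60  // max seconds for the whole transfer
    return config
}

let proxyConfig = makeConfiguration(ephemeral: true)   // e.g. for the DNS proxy target
let appConfig = makeConfiguration(ephemeral: false)    // e.g. for the main target
print(proxyConfig.timeoutIntervalForRequest, appConfig.timeoutIntervalForResource)
```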

The interesting thing is that before the failed task occurs, it prints the session protocols as ["-"]. I guess that means the session failed to negotiate a protocol for the outgoing request.

Here are examples of URLSession configurations that I use for DNS Proxy Provider and my Main target

/// DNSProxy network service
public final class DNSProxyNetworkService: NSObject, Requestable, URLSessionTaskDelegate {
    static let shared = DNSProxyNetworkService()
    
    lazy var session: URLSession = {
        let config = URLSessionConfiguration.ephemeral

        return URLSession(
            configuration: config,
            delegate: self,
            delegateQueue: nil
        )
    }()    
}

extension DNSProxyNetworkService {
    public func urlSession(_ session: URLSession, task: URLSessionTask, didFinishCollecting metrics: URLSessionTaskMetrics) {
        let protocols = metrics.transactionMetrics.map { $0.networkProtocolName ?? "-" }
        Logger.statistics.debug("[DNSProxyNetworkService] – session protocols: \(protocols, privacy: .public)")
    }
}
/// MainTarget network service
public final class MainTargetNetworkService: NSObject, Requestable, URLSessionTaskDelegate {
    static let shared = MainTargetNetworkService()
    
    lazy var session: URLSession = {
        let config = URLSessionConfiguration.default

        return URLSession(
            configuration: config,
            delegate: self,
            delegateQueue: nil
        )
    }()
}

extension MainTargetNetworkService {
    public func urlSession(_ session: URLSession, task: URLSessionTask, didFinishCollecting metrics: URLSessionTaskMetrics) {
        let protocols = metrics.transactionMetrics.map { $0.networkProtocolName ?? "-" }
        Logger.statistics.debug("[MainTargetNetworkService] – session protocols: \(protocols, privacy: .public)")
    }
}

Note: this issue mostly occurs if the build is initiated from Xcode when the device already has app installed or during initial launch for the first build on the device

Would be grateful to hear any advice or suggestions for further investigation of this issue, thank you!

URLSession should work in a DNS proxy provider. The only specific gotcha I’m aware of is the App Sandbox, but that only applies to macOS. I’m presuming you’re on iOS, based on the crash reports you posted earlier. Let me know if that’s wrong.

Let’s drop down a layer. If you open a connection to the server over TCP using NWConnection, does that work?

Here’s a snippet of how you can try this:

import Network

class MyClass {

    var connectionQ: NWConnection? = nil
    
    func start() -> NWConnection {
        print("connection will start")
        let connection = NWConnection(to: .hostPort(host: "example.com", port: 80), using: .tcp)
        connection.stateUpdateHandler = { newState in
            print("connection did change state, new: \(newState)")
        }
        connection.start(queue: .main)
        return connection
    }
    
    func stop(connection: NWConnection) {
        print("connection will stop")
        connection.stateUpdateHandler = nil
        connection.cancel()
    }
    
    func startStop() {
        if let connection = self.connectionQ {
            self.connectionQ = nil
            self.stop(connection: connection)
        } else {
            self.connectionQ = self.start()
        }
    }
}

Replace example.com with the name of the server you’re trying to connect to. If the server only supports HTTPS, replace 80 with 443 and .tcp with .tls.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Yes, it seems to work fine with NWConnection

The app is designed to have optional resolver (system resolver/custom DoH server). If system resolver is in use, I use NWConnection, for DoH resolver it's HTTPS request with HTTP3 enabled (server only listens to HTTP3).

private func handleNewFlow(_ flow: NEAppProxyUDPFlow) -> Bool {
        Task(priority: .high) { [weak self] in
            await self?.handleNewFlow(flow)
        }
        return true
    }
    
    private func handleNewFlow(_ flow: NEAppProxyUDPFlow) async {
        do {
            try await flow.open(withLocalEndpoint: flow.localEndpoint as? NWHostEndpoint)
            
            let datagrams = try await flow.readDatagrams()

            let results = await datagrams.parallelMap { [weak self] in
                let connection = DatagramConnection($0)
                
                let connectionType = self?.connectionType
                let resolverType = self?.resolverType
                let serverStatus = self?.serverStatus
                          
                return await connection.transferData(
                    status: serverStatus,
                    resolverType: resolverType,
                    connectionType: connectionType
                )
            }
                    
            try await flow.writeDatagrams(results)
                                    
            flow.closeReadWithError(nil)
            flow.closeWriteWithError(nil)
        } catch {           
            flow.closeReadWithError(error)
            flow.closeWriteWithError(error)
        }
    }

In transferData there is a conditional call for

private func resolveDatagramWithSystem(datagram: Datagram) async -> Data?

or

private func resolveDatagramWithDoH(
        question: DNSQuestion,
        packet: DNSRR,
        resolver: ProxyResolverType?,
        server: ServerType?
    ) async -> Data?

Here is what my resolveDatagramWithSystem looks like

private func resolveDatagramWithSystem(datagram: Datagram) async -> Data? {
        do {
            let connection: NWConnection
            
            switch datagram.endpoint {
            case let .host(hostEndpoint):
                guard let port = Network.NWEndpoint.Port(hostEndpoint.port) else {
                    throw NSError.unknown(thrownBy: Self.self)
                }
                let host = Network.NWEndpoint.Host(hostEndpoint.hostname)
                connection = NWConnection(host: host, port: port, using: .udp)
            case .bonjour:
                throw NSError.unknown(thrownBy: Self.self)
            }
            try await connection.establish(on: .datagramConnection)
            try await connection.send(content: datagram.packet)
            let message = try await connection.receiveMessage()
            let messageData = message.completeContent
            
            return messageData
        } catch {
            Logger.statistics.error("[DatagramConnection] - Failed to handle connection: \(error, privacy: .public)")
        }
        
        return nil
    }

Yes, it seems to work fine with NWConnection

Right, but I was asking about connecting to your HTTP server with NWConnection. And it’s hard to tell whether that’s working because the server is HTTP/3 only, and hence won’t accept TCP+TLS connections on port 443.
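One way to run the same kind of test against an HTTP/3-only server might be a QUIC connection rather than TCP+TLS. This is a minimal sketch under stated assumptions: it requires iOS 15 or later, it assumes the server negotiates the `h3` ALPN, the host name is a placeholder, and `makeQUICProbe` is a hypothetical helper. Reaching the ready state would only show that name resolution and the QUIC handshake work, not that HTTP requests succeed.

```swift
import Network

// Creates (but does not start) a QUIC connection to port 443 of the given host.
func makeQUICProbe(host: String) -> NWConnection {
    // Assumes the server advertises HTTP/3 via the "h3" ALPN identifier.
    let parameters = NWParameters.quic(alpn: ["h3"])
    let connection = NWConnection(host: NWEndpoint.Host(host), port: 443, using: parameters)
    connection.stateUpdateHandler = { newState in
        print("QUIC probe did change state, new: \(newState)")
    }
    return connection
}

// Placeholder host; call probe.start(queue: .main) to begin connecting.
let probe = makeQUICProbe(host: "example.com")
```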

When URLSession fails, how reproducible is that? For all requests? Or just in specific circumstances?

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"
