Advanced UDP with Network.framework

I finally found the time to experiment with Network.framework, and I find the experience very pleasant. The API looks well thought out and is a pleasure to work with.

My app (on the App Store for around 13 years) uses UDP networking to create a mesh between multiple devices where each device acts as a server and client at the same time.

It does that by creating one UDP socket on each device, bound to *:<port>, and then uses that socket to sendmsg to other peers. That way all peers can communicate with each other using their well-known <port>. This all works great.

To test the performance and verify the functionality, I have multiple XCTestCase scenarios where I create multiple peers, simulate various communications, and verify their correctness. Within the XCTestCase process that means creating multiple underlying sockets and binding them to multiple random local <port>s. Works great.

Now I'm trying to port this functionality to Network.framework.

Let's assume two peers for now. Code simplified.

Prepare NWParameters instances for both peers, plain UDP for now.

// NOTE: params for peer 1
let p1 = NWParameters(dtls: nil, udp: .init())
p1.requiredLocalEndpoint = .hostPort(host: "0.0.0.0", port: 2222)
p1.allowLocalEndpointReuse = true

// NOTE: params for peer 2
let p2 = NWParameters(dtls: nil, udp: .init())
p2.requiredLocalEndpoint = .hostPort(host: "0.0.0.0", port: 3333)
p2.allowLocalEndpointReuse = true

Create NWListeners for each peer.

// NOTE: listener for peer 1 - callbacks omitted for brevity
let s1 = try NWListener(using: p1)
s1.start(queue: DispatchQueue.main)

// NOTE: listener for peer 2 - callbacks omitted for brevity
let s2 = try NWListener(using: p2)
s2.start(queue: DispatchQueue.main)

The listeners start correctly and I can verify that I have two UDP ports open on my machine and bound to port 2222 and 3333. I can use netcat -u to send UDP packets to them and correctly see the appropriate NWConnection objects being created and all callbacks invoked. So far so good.
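
For reference, the callbacks omitted from the listener snippets could look roughly like the sketch below. The helper name makeUDPListener and its onDatagram closure are made up for illustration and are not part of Network.framework; the receive loop is one common pattern, not necessarily what my app does.

```swift
import Foundation
import Network

// Hypothetical helper (not a framework API): builds a UDP listener bound to
// 0.0.0.0:<port> and passes every received datagram to `onDatagram`.
func makeUDPListener(port: NWEndpoint.Port,
                     onDatagram: @escaping (Data, NWEndpoint) -> Void) throws -> NWListener {
    let params = NWParameters.udp
    params.requiredLocalEndpoint = .hostPort(host: "0.0.0.0", port: port)
    params.allowLocalEndpointReuse = true

    let listener = try NWListener(using: params)
    listener.stateUpdateHandler = { state in
        print("listener \(port) state: \(state)")
    }
    // The listener hands over one NWConnection per remote peer flow.
    listener.newConnectionHandler = { connection in
        // Keep receiving datagrams until the connection reports an error.
        func receiveNext() {
            connection.receiveMessage { data, _, _, error in
                if let data = data {
                    onDatagram(data, connection.endpoint)
                }
                if error == nil {
                    receiveNext()
                }
            }
        }
        connection.start(queue: .main)
        receiveNext()
    }
    return listener
}
```

With that helper, peer 1 would be created with something like `let s1 = try makeUDPListener(port: 2222) { data, from in print(from, data.count) }` followed by `s1.start(queue: .main)`.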

Now in that XCTestCase, I want to exchange packets between peer1 and peer2. So I create the appropriate NWConnection and send data.

// NOTE: connection to port 3333 from port 2222
let c1 = NWConnection(host: "127.0.0.1", port: 3333, using: p1)
c1.start(queue: DispatchQueue.main)
// NOTE: wait for the c1 state .ready
c1.send(content: ..., completion: ...)

And now comes the problem.

The connection transitions to .preparing state, with correct parameters, and then to .waiting state with Error 48.

[L1 ready, local endpoint: <NULL>, parameters: udp, local: 0.0.0.0:2222, definite, attribution: developer, server, port: 3333, path satisfied (Path is satisfied), interface: en0[802.11], ipv4, dns, uses wifi, service: <NULL>]

[L2 ready, local endpoint: <NULL>, parameters: udp, local: 0.0.0.0:3333, definite, attribution: developer, server, port: 2222, path satisfied (Path is satisfied), interface: en0[802.11], ipv4, dns, uses wifi, service: <NULL>]


nw_socket_connect [C1:1] connectx(6 (guarded), [srcif=0, srcaddr=0.0.0.0:2222, dstaddr=127.0.0.1:3333], SAE_ASSOCID_ANY, 0, NULL, 0, NULL, SAE_CONNID_ANY) failed: [48: Address already in use]
nw_socket_connect [C1:1] connectx failed (fd 6) [48: Address already in use]
nw_socket_connect connectx failed [48: Address already in use]

state: preparing connection: [C1 127.0.0.1:2222 udp, local: 0.0.0.0:2222, attribution: developer, path satisfied (Path is satisfied), interface: lo0]
state: waiting(POSIXErrorCode(rawValue: 48): Address already in use) connection: [C1 127.0.0.1:3333 udp, local: 0.0.0.0:2222, attribution: developer, path satisfied (Path is satisfied), interface: lo0]

I believe this happens because, under the hood, the connection c1 tries to create a new socket instead of reusing the one prepared for s1.

I can make c1 work if I create the NWConnection without binding it to the same local endpoint as the listener s1. But in that case the packets sent via c1 use a random outgoing port.

What am I missing to make this scenario work in Network.framework?

P.S. I was able to make it work using the following trick:

  1. bind the s1 NWListener local endpoint to ::2222 (IPv6)
  2. connect the c1 NWConnection to 127.0.0.1:3333 (IPv4)

That way packets on the wire are sent/received correctly.

I believe this is a bug in Network.framework. I can open a DTS request if more information is needed.

Replies

Quoting TN3151 Choosing the right networking API:

For UDP flows—where you have a stream of unicast datagrams flowing between two peers—Network framework is the best choice … However, not all UDP communication is that straightforward … If you need something that’s not supported by Network framework, use BSD Sockets.

The exact limits of Network framework’s UDP support are kinda fuzzy, but your situation:

My app … uses UDP networking to create a mesh between multiple devices where each device acts as a server and client at the same time.

is likely to be outside those limits.

I have a question about your real product, not your test case. You wrote:

It does that by creating one UDP socket on each device bound to a *:<port> and then uses that socket to sendmsg to other peers.

So, the packets on the wire contain UDP datagrams where both the local and remote port values are <port>?

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Thanks @eskimo for your insights, as always.

We use http://enet.bespin.org for reliable peer-to-peer communication. It's a very old and very reliable UDP library, not dissimilar to QUIC, except that it existed long before QUIC. It provides multiple independent streams of data (reliable and unreliable) on top of one UDP flow. It was created for multiplayer games and is super easy to work with.

I have a nice and small Swift based API written for that, in production on Linux and iOS/macOS since 2014 - that I plan to open source at some point.

I want to move it over to Network.framework - to make it work on the Apple Watch where BSD sockets do not work, and to gain performance by moving away from BSD sockets. Networking guys (from Apple) I had a chance to talk to claimed that moving to Network.framework will make everything faster...

I will try to explain the problem as simply as possible. Please bear with me :)

On each device you bind one UDP socket to a random port. And that socket/port is used for all outgoing and incoming communication with the other peers. The UDP datagrams contain a simple protocol that handles streams, retransmits, datagram re-ordering, etc.

So imagine App1 opens UDP Socket1, and binds it to ::Port1.

App2 opens Socket2, and binds it to ::Port2.

App3 opens Socket3, and binds it to ::Port3.

Communication from App1 to App2 goes via local Socket1, leaving the machine on Port1 and going to App2 via Socket2 and remote Port2.

Communication from App1 to App3 also goes via local Socket1, leaving the machine on Port1 and going to App3 via Socket3 and remote Port3.

Communication from App2 to App3 goes via local Socket2, leaving the machine on Port2 and going to App3 via Socket3 and remote Port3.

Replies go the other way around.

In the BSD sockets world this is possible even on the same networking interface and in the same UNIX process, because Socket1, Socket2, and Socket3 are independent file descriptors in the kernel, and sendmsg and recvmsg can use these sockets to send to and receive from any UDP address. IIRC this is what is called an "unconnected UDP socket".

And you essentially use the same fd for both sendmsg and recvmsg.
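
To make the BSD side concrete, here is a minimal sketch of that pattern, assuming IPv4 loopback and kernel-assigned ports so it can run inside one process. The helper names (makeBoundUDPSocket, localPort, sendDatagram, receiveDatagram) are invented for this example and it uses sendto/recv rather than sendmsg/recvmsg to stay short; our real code is built on ENet and looks different.

```swift
import Foundation

#if os(Linux)
let udpSocketType = Int32(SOCK_DGRAM.rawValue)
#else
let udpSocketType = SOCK_DGRAM
#endif

// Builds the IPv4 address 127.0.0.1:<port>.
func loopbackAddress(port: UInt16) -> sockaddr_in {
    var addr = sockaddr_in()
    addr.sin_family = sa_family_t(AF_INET)
    addr.sin_port = port.bigEndian
    addr.sin_addr = in_addr(s_addr: UInt32(0x7f00_0001).bigEndian)
    return addr
}

// Creates an unconnected UDP socket bound to 127.0.0.1:<port> (0 = any free port).
func makeBoundUDPSocket(port: UInt16) -> Int32 {
    let fd = socket(AF_INET, udpSocketType, 0)
    precondition(fd >= 0, "socket() failed")
    var addr = loopbackAddress(port: port)
    let result = withUnsafePointer(to: &addr) {
        $0.withMemoryRebound(to: sockaddr.self, capacity: 1) {
            bind(fd, $0, socklen_t(MemoryLayout<sockaddr_in>.size))
        }
    }
    precondition(result == 0, "bind() failed")
    return fd
}

// Returns the port the socket is actually bound to.
func localPort(of fd: Int32) -> UInt16 {
    var addr = sockaddr_in()
    var length = socklen_t(MemoryLayout<sockaddr_in>.size)
    _ = withUnsafeMutablePointer(to: &addr) {
        $0.withMemoryRebound(to: sockaddr.self, capacity: 1) {
            getsockname(fd, $0, &length)
        }
    }
    return UInt16(bigEndian: addr.sin_port)
}

// Sends one datagram from `fd` to 127.0.0.1:<port>; the same fd can target any peer.
func sendDatagram(_ text: String, from fd: Int32, toPort port: UInt16) -> Int {
    var addr = loopbackAddress(port: port)
    let bytes = Array(text.utf8)
    return withUnsafePointer(to: &addr) {
        $0.withMemoryRebound(to: sockaddr.self, capacity: 1) {
            sendto(fd, bytes, bytes.count, 0, $0, socklen_t(MemoryLayout<sockaddr_in>.size))
        }
    }
}

// Blocks until one datagram arrives on `fd` and returns it as text.
func receiveDatagram(on fd: Int32) -> String {
    let capacity = 1500
    var buffer = [UInt8](repeating: 0, count: capacity)
    let count = recv(fd, &buffer, capacity, 0)
    return String(decoding: buffer.prefix(max(count, 0)), as: UTF8.self)
}

// Three peers in one process; peer1 talks to both others from its single socket.
let peer1 = makeBoundUDPSocket(port: 0)
let peer2 = makeBoundUDPSocket(port: 0)
let peer3 = makeBoundUDPSocket(port: 0)
_ = sendDatagram("hello peer2", from: peer1, toPort: localPort(of: peer2))
_ = sendDatagram("hello peer3", from: peer1, toPort: localPort(of: peer3))
print(receiveDatagram(on: peer2))
print(receiveDatagram(on: peer3))
```

The point of the sketch is that peer1 never calls connect, so one bound file descriptor serves every remote peer.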

In the Network.framework world, Listener1 will allow me to receive UDP datagrams on Port1. Listener2 on Port2, and Listener3 on Port3.

Connecting from App1 (Port1) to App2 (Port2) is only possible by creating a new NWConnection with requiredLocalEndpoint set to Listener1.requiredLocalEndpoint.

This works fine, as long as Port2 is on another networking interface or another machine.

But it breaks when Port1, and Port2 are used via NWListener or NWConnection in the same (UNIX) process and on the same networking interface. Say for example in one XCTestCase.

So if I want to create a peer1 (NWListener1) and a peer2 (NWListener2), both bound to localhost, and then open an NWConnection1to2, it will break.

If I bind peer1 to en0 and the peer2 to lo0 it will work.

I have no insight into how Network.framework is implemented, and this only affects NWListeners and NWConnections using UDP, in the same process, and on the same interface. So it is not a real-world use case, only a testing use case.

Sorry for the long and chaotic description. I have a simple Swift package demonstrating the code ready, and if you want I can open a DTS request to investigate this further.

At the moment, for XCTestCase, I create one peer using Network.framework and the other peer using BSD sockets and shuffle data between them.

But I would love to verify all the functionality when both peers use Network.framework. Which is easier over lo0.

I am thinking along the lines of enumerating all interfaces on the machine and binding peer1 to one interface (say en0) and the other to another interface (say en1) so that they can communicate together successfully.
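
A rough sketch of the enumeration step, assuming plain getifaddrs and IPv4 only. The function name ipv4Interfaces is made up for this example; Network.framework itself may offer a better way to pick an interface.

```swift
import Foundation

// Lists (interface name, IPv4 address) pairs for all interfaces that
// currently have an IPv4 address, using getifaddrs.
func ipv4Interfaces() -> [(name: String, address: String)] {
    var results: [(name: String, address: String)] = []
    var head: UnsafeMutablePointer<ifaddrs>?
    guard getifaddrs(&head) == 0 else { return results }
    defer { freeifaddrs(head) }

    var cursor = head
    while let entry = cursor {
        // Always advance to the next list entry, even when skipping this one.
        defer { cursor = entry.pointee.ifa_next }
        guard let sa = entry.pointee.ifa_addr,
              Int32(sa.pointee.sa_family) == AF_INET else { continue }
        var addr = sa.withMemoryRebound(to: sockaddr_in.self, capacity: 1) {
            $0.pointee.sin_addr
        }
        let length = Int(INET_ADDRSTRLEN)
        var text = [CChar](repeating: 0, count: length)
        guard inet_ntop(AF_INET, &addr, &text, socklen_t(length)) != nil else { continue }
        results.append((name: String(cString: entry.pointee.ifa_name),
                        address: String(cString: text)))
    }
    return results
}

for interface in ipv4Interfaces() {
    print(interface.name, interface.address)
}
```

The idea would be to take the addresses of two different interfaces from this list and use each one as the host of a peer's requiredLocalEndpoint. Whether that reliably avoids the error is only based on my en0/lo0 observation above.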

On each device you bind one UDP socket to a random port. And that socket/port is used for all outgoing and incoming communication with the other peers.

OK. In theory that should be compatible with Network framework’s UDP support.

But it breaks when Port1, and Port2 are used via NWListener or NWConnection in the same (UNIX) process and on the same networking interface.

Right. I think this is a known issue. It’s related to the issue discussed here, but it’s not exactly the same.

Consider the program pasted in below. This starts two connections, with the flow tuples:

  • localIP / 12345 / 93.184.216.34 / 23456

  • localIP / 12345 / 93.184.216.34 / 23457

This should be feasible because UDP flows are uniquely identified by their tuples, and these tuples are distinct.
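
As a tiny illustration of that reasoning, the sketch below models a flow tuple as a hashable value and shows the two tuples compare as different. The FlowTuple type is invented for this example and has nothing to do with how Network framework tracks flows internally.

```swift
// Hypothetical model of a UDP flow tuple, for illustration only.
struct FlowTuple: Hashable {
    var localAddress: String
    var localPort: UInt16
    var remoteAddress: String
    var remotePort: UInt16
}

let firstFlow = FlowTuple(localAddress: "localIP", localPort: 12345,
                          remoteAddress: "93.184.216.34", remotePort: 23456)
let secondFlow = FlowTuple(localAddress: "localIP", localPort: 12345,
                           remoteAddress: "93.184.216.34", remotePort: 23457)

// Same local port, different remote port: the tuples are not equal,
// so a set keeps both of them.
let flows: Set<FlowTuple> = [firstFlow, secondFlow]
print(firstFlow == secondFlow, flows.count)
```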

However, when you run it you get this [1]:

connection 23456 did change state, new: preparing
connection 23456 did change state, new: ready
connection 23457 did change state, new: preparing
connection 23457 did change state, new: waiting(POSIXErrorCode(rawValue: 48): Address already in use)

The second connection is failing with EADDRINUSE.

This result confirms that your on-the-wire protocol won’t work with the current Network framework. You absolutely need to be able to start two connections to different peers with the same source port. Given that, I encourage you to file your own bug. Please post your bug number, just for the record.

Note that this doesn’t involve NWListener at all; it’s just two outgoing connections. Adding NWListener into the mix is not going to improve things |-:

to make it work on the Apple Watch

You’ve read TN3135 Low-level networking on watchOS, right?

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

[1] I’m testing on 14.3.1 but this isn’t a new problem.


import Foundation
import Network

let localPort: NWEndpoint.Port = 12345

var connections: [NWConnection] = []

func startFlow(remotePort: UInt16) {
    let params = NWParameters.udp
    params.allowLocalEndpointReuse = true
    params.requiredLocalEndpoint = NWEndpoint.hostPort(host: "0.0.0.0", port: localPort)
    let conn = NWConnection(host: "93.184.216.34", port: .init(rawValue: remotePort)!, using: params)
    conn.stateUpdateHandler = { newState in
        print("connection \(remotePort) did change state, new: \(newState)")
    }
    conn.start(queue: .main)
    connections.append(conn)
}

func main() {
    startFlow(remotePort: 23456)
    startFlow(remotePort: 23457)
    dispatchMain()
}

main()

Thanks @eskimo for the explanation. You nailed it completely. So I reused your example code which explains the bug much more clearly than my original description and submitted a bug FB13678278.

Our app (https://cloudbabymonitor.com) actually back deploys to iOS 12 at the moment, so I hoped to switch to Network.framework with the next update that will support iOS 13+. But this bug is a showstopper, as our app relies on opening multiple UDP flows from the same local port, and that seems to be broken, and I am not sure we will ever see this fixed in iOS 13+ (since iOS 12 is no longer receiving updates).

As for the watch, we do stream audio and video at the same time, so on the Watch that should satisfy the requirement for using Network.framework. However, it all comes down to the final experience. At the moment we can successfully establish a UDP audio/video stream between two iPhones on WiFi in about 1 second - and that includes Bonjour discovery, DNS resolution, and everything else. I'm not sure yet whether I can get to 1 second on the Watch, from looking at the wrist to seeing live video, but I want to have the code ready, see it running, and try to get it that fast purely for the engineering pleasure of it :) - if not for the happiness of our customers, who have been begging for the Apple Watch live video feature for years already :)

thanks for your help, Martin