NWConnection never sends failed status

Hello everyone,


I'm currently creating a Swift Framework to add TCP capabilities to a C++ application.

I'm running into issues that I don't see when I test the same Swift code outside of a framework.


To be brief: I never receive failed state updates. The stateUpdateHandler is never called with a failed State or with a cancelled State. This happens however the connection is cut, whether I kill the client application or call cancel() client side. I receive waiting and ready states properly and my connections work fine, which puzzles me.


To describe my configuration precisely: both my client and my server use a Swift framework built on Network.framework. They are both C++ applications, running on the same machine (development environment). Communication between the C++ and Swift parts of the application is fine.

I can create connections and send data over them without problems. If I cancel the connection on either the client or the server side, the other side never gets notified. The same happens if I kill the client or the server. If I test my code in a simple Swift project, it works fine. In my opinion, the Swift code must be inside a framework for the issue to happen.


I've looked at everything for 2 days, and I don't think I'm making any obvious mistake. The fact that I receive all the other states properly makes me think there's an issue with the specific way failed and cancelled statuses are handled.


Thanks for any help you can give me.


Replies

Do you have a receive pending? If not, set one up. My experience is that NWConnection will only notice dead connections if there’s a receive pending.

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"
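
For readers following along, here is a minimal sketch of what "keeping a receive pending" could look like. The `ReceiveLoop` class and its `onData` / `onEnd` callbacks are hypothetical names invented for this illustration; only `NWConnection.receive(minimumIncompleteLength:maximumLength:completion:)` comes from Network.framework. It assumes a plain TCP byte stream and that the caller keeps a strong reference to the loop object.

```swift
import Foundation
import Network

/// Hypothetical helper (not part of Network.framework): re-arms a receive
/// after every delivery so that a read is always pending on the connection.
final class ReceiveLoop {
    private let connection: NWConnection
    /// Called with each chunk of bytes as it arrives.
    var onData: ((Data) -> Void)?
    /// Called once when reading stops: nil means a clean end of stream.
    var onEnd: ((NWError?) -> Void)?

    init(connection: NWConnection) {
        self.connection = connection
    }

    /// Call once the connection has reached the .ready state.
    func start() {
        connection.receive(minimumIncompleteLength: 1, maximumLength: 65_536) { [weak self] content, _, isComplete, error in
            // The loop stops silently if nobody retains this object any more.
            guard let self = self else { return }
            if let content = content, !content.isEmpty {
                self.onData?(content)
            }
            if let error = error {
                self.onEnd?(error)   // read side failed; do not re-arm
                return
            }
            if isComplete {
                self.onEnd?(nil)     // peer closed its side of the stream
                return
            }
            self.start()             // keep exactly one receive pending
        }
    }
}
```

The loop deliberately does not retry after an error or end of stream; what to do at that point (typically cancelling the connection) is left to the `onEnd` callback.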

Hi Eskimo,


Thanks for your answer. I've modified my code a bit, and I now listen on both ends all the time.

It's better, but still a bit weird:

- If I cancel a connection, I get the State cancelled on the connection I cancelled, but nothing on the other side. As it's TCP, I should be able to know when a connection is closed.

- I get the failed State if I try to send something on the cancelled connection on either end. So, I can work with that, but I have no way to know if a user is disconnected properly or if I have a connection failure somewhere. It's a bit annoying without being blocking.

Cancelling a connection should trigger the shutdown sequence at the TCP level. We have forceCancel to opt in to the opposite behaviour (although even that won’t avoid the TCP shutdown, but it will avoid TLS shutdown).

I tried this with a small test project here in my office and I’m not seeing the behaviour you’re seeing. Weird.

The next thing to do here is use a packet trace to divide the problem in half. Is the problem that the peer calling cancel isn’t shutting down the TCP connection? Or that the remote peer isn’t noticing that shutdown? That is, do you see the FIN/FIN-ACK/ACK sequence on the wire?

For a basic introduction to packet traces, see Recording a Packet Trace.

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

Hi Eskimo,


I've found something strange (in my opinion).

I don't receive State changes, but I receive errors on my receive function.

I was listening to State changes to handle my connection state but I also have to consider that an error in the receive function means that the connection may be down.


It raises new questions:

- What errors mean that the connection is down, and what errors mean I should retry my receive? Currently, I'm cancelling the connection any time I get an error, but I may end up with my connections being less resilient than they should be.

- Do I have to call cancel on both ends to be sure I properly finish the TCP close?

I also have to consider that an error in the receive function means that the connection may be down.

Right. You also have to pay attention to EOF in your receive callback, because most TCP connections close cleanly, and that’s how a clean close is indicated.

What errors mean that the connection is down, and what errors mean I should retry my receive?

For TCP you’d never retry a receive.

Do I have to call cancel on both ends to be sure I properly finish the TCP close?

I don’t think it’s required to finish the TCP close, but it’s something you want to be doing regardless. The alternative would be to release your last reference to the NWConnection and hope for the best [1].

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

[1] For NWConnection I believe that will actually work, because the API requires you to maintain a reference to the connection, but in general this is bad form for I/O channels. There are lots of I/O APIs where releasing the I/O channel does not clean things up properly, the most obvious one being classic UNIX file descriptors.
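
The advice above (watch for EOF, never retry a TCP receive, cancel explicitly) can be sketched as a small decision function. `ReceiveOutcome`, `classifyReceive` and `handleReceive` are hypothetical names made up for this example. The sketch assumes a plain TCP stream read with `receive(minimumIncompleteLength:maximumLength:completion:)`, where `isComplete` with no error indicates that the peer closed its side; with `receiveMessage` and a custom framer, `isComplete` refers to a single message instead.

```swift
import Foundation
import Network

/// Hypothetical summary of what one receive completion means for the caller.
enum ReceiveOutcome {
    case keepReceiving   // stream still open: issue the next receive
    case cleanClose      // EOF: the peer closed its side cleanly
    case failed          // error: the connection is no longer usable
}

/// Classifies the `isComplete` and `error` arguments of a receive completion.
/// Any error is treated as final, since a TCP receive is never retried.
func classifyReceive(isComplete: Bool, error: NWError?) -> ReceiveOutcome {
    if error != nil { return .failed }
    if isComplete { return .cleanClose }
    return .keepReceiving
}

/// Applies the outcome: re-arm the receive, or cancel the connection so that
/// this side also goes through its own shutdown.
func handleReceive(on connection: NWConnection,
                   isComplete: Bool,
                   error: NWError?,
                   rearm: () -> Void) {
    switch classifyReceive(isComplete: isComplete, error: error) {
    case .keepReceiving:
        rearm()
    case .cleanClose, .failed:
        connection.cancel()
    }
}
```

Keeping `.cleanClose` and `.failed` as separate cases gives the application a place to tell an orderly disconnect from a connection failure, even though both end in `cancel()` here.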

Thanks a lot!


I must admit I have a hard time finding good information on Network.framework in Swift. I'm using a lot of code I find on GitHub, but I have no guarantee it's properly written.

I'll continue implementing my framework, I hope I won't face other issues.

I'm going to dig this back up because it's directly related to my own questions.

The context:

==

@SuperBidi: [When the connection is gracefully closed by the other side....] I don't receive State changes, but I receive errors on my receive function. I was listening to State changes to handle my connection state but I also have to consider that an error in the receive function means that the connection may be down.

@eskimo: Right. You also have to pay attention to EOF in your receive callback, because most TCP connections close cleanly, and that’s how a clean close is indicated.

So with that in mind, here's what I'm doing and seeing:

  • I open a connection to a remote, the NWConnection state is .ready
  • (misc things may happen)
  • I then receive() to wait for any incoming data.
  • The remote side gracefully closes the connection (FIN/FIN-ACK)
  • The receive() callback comes back with an error NWError.posix(.ENODATA) (I assume this is the exact "EOF" @eskimo mentioned)
  • The NWConnection does not change states. It is still .ready.

What's not clear to me is why it's still considered "ready". It seems that "ready" doesn't necessarily mean "connected". The key question at this moment is: should the local side do anything, or just sit and wait until it next needs to send before doing anything with the connection?

Now let's say, my local side wants to send data to the remote. The connection is "ready" so:

  • I call send(), the callback is called with no error
  • Simultaneously the NWConnection internally recognizes the connection has been reset ([connection] nw_socket_handle_socket_event [C4:2] Socket SO_ERROR [54: Connection reset by peer])
  • The NWConnection's state now transitions to .failed.

What I'm left with here is a message that says it got through to the remote before the connection failed, but it didn't.

It seems as if when the ENODATA error is passed into the receive() callback, then the connection should be explicitly cancelled, so the local side knows the connection is not established anymore? But then why doesn't the framework itself do this?

How is this supposed to be handled correctly?

Thanks.
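
One possible way to reconcile a `.ready` state with a stream the peer has already closed is to track the read-side close yourself and refuse to send afterwards. The sketch below is only an illustration of that idea: `ConnectionHealth` and its methods are hypothetical names, treating `NWError.posix(.ENODATA)` as "peer closed" is an assumption taken from the observation above rather than documented behaviour, and all calls are assumed to happen on the connection's queue.

```swift
import Foundation
import Network

/// Hypothetical tracker that combines state updates and receive results
/// into one answer to "is it still worth sending on this connection?".
final class ConnectionHealth {
    private(set) var isReady = false
    private(set) var peerClosed = false

    /// True only while the connection is .ready and no end of stream was seen.
    var canSend: Bool { isReady && !peerClosed }

    /// Call from the connection's stateUpdateHandler.
    func stateChanged(_ state: NWConnection.State) {
        if case .ready = state {
            isReady = true
        } else {
            isReady = false
        }
    }

    /// Call from the receive completion. Returns true when the caller
    /// should stop receiving and cancel the connection.
    func receiveFinished(isComplete: Bool, error: NWError?) -> Bool {
        if let error = error {
            // Assumption: ENODATA is how the clean close showed up above;
            // every other error is also treated as the end of the stream.
            if case .posix(.ENODATA) = error { peerClosed = true }
            else { peerClosed = true }
            return true
        }
        if isComplete {
            peerClosed = true
            return true
        }
        return false
    }
}
```

With something like this, the send path can check `canSend` first instead of relying on the send completion, which (as described above) may report no error even though the data never reached the remote.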

I then receive() to wait for any incoming data.

What does this receive(…) look like?

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

I am using the same approach as in the TicTacToe example https://developer.apple.com/documentation/network/building_a_custom_peer-to-peer_protocol

connection.stateUpdateHandler = { [weak self] (newState) in
	guard let self = self else { return }
	
	switch newState {
	case .ready:
		...
		self.receiveNextMessage()
	case .failed(let err):
		print("Device TCP connection failed: \(err)")
		...
	case .cancelled:
		print("Device TCP connection cancelled")
		...
	default:
		break
	}
}



private func receiveNextMessage() {
	guard let connection = connection else { return }
	
	connection.receiveMessage { [weak self] (content, context, isComplete, error) in
		guard let self = self else { return }
		
		if let content = content {
			...
			
		} else if let error = error {
			switch error {
			...
			case NWError.posix(.ENODATA):
				// Without forceCancel, the connection state does not change until I later try to send data
				self.connection?.forceCancel()
				break
					
			default:
				print("Receive error: \(error)")
			}
		}
		
		if error == nil {
			self.receiveNextMessage()
		}
	}
}

I am also trying out the Tic Tac Toe app and I want to know how to check for invalid passcode. I am able to get everything working with the correct code, but when I enter an invalid code, I want to close out the connection, but I am unable to act on the TLS PSK authentication. Does anyone know how to handle events on an invalid TLS handshake?

I am using this code in a convenience init for NWParameters.

        var authenticationCode = HMAC<SHA256>.authenticationCode(for: "mysharedsecret".data(using: .utf8)!, using: authenticationKey)

        let authenticationDispatchData = withUnsafeBytes(of: &authenticationCode) { (ptr: UnsafeRawBufferPointer) in
            DispatchData(bytes: ptr)
        }

        sec_protocol_options_add_pre_shared_key(tlsOptions.securityProtocolOptions,
                                                authenticationDispatchData as __DispatchData,
                                                stringToDispatchData("mysharedsecret")! as __DispatchData)

        sec_protocol_options_append_tls_ciphersuite(tlsOptions.securityProtocolOptions,
                                                    tls_ciphersuite_t(rawValue: TLS_PSK_WITH_AES_128_GCM_SHA256)!)

I assumed that the Connection Failed state would execute, but it does not.

Here is what I received when I send an invalid code:

2023-04-25 15:03:22.703980-0400 objcbonjour[26277:983754] [connection] nw_socket_handle_socket_event [C1.1.8.1:3] Socket SO_ERROR [54: Connection reset by peer]
2023-04-25 15:03:22.710169-0400 objcbonjour[26277:983754] [boringssl] boringssl_context_handle_fatal_alert(1991) [C1.1.10.1:2][0x7fe32e1337a0] read alert, level: fatal, description: bad record mac
2023-04-25 15:03:22.710933-0400 objcbonjour[26277:983754] [boringssl] boringssl_session_handshake_incomplete(88) [C1.1.10.1:2][0x7fe32e1337a0] SSL library error
2023-04-25 15:03:22.711191-0400 objcbonjour[26277:983754] [boringssl] boringssl_session_handshake_error_print(43) [C1.1.10.1:2][0x7fe32e1337a0] Error: 140613710266504:error:100003fc:SSL routines:OPENSSL_internal:SSLV3_ALERT_BAD_RECORD_MAC:/Library/Caches/com.apple.xbs/Sources/boringssl_Sim/ssl/tls_record.cc:594:SSL alert number 20
2023-04-25 15:03:22.711388-0400 objcbonjour[26277:983754] [boringssl] nw_protocol_boringssl_handshake_negotiate_proceed(771) [C1.1.10.1:2][0x7fe32e1337a0] handshake failed at state 12288: not completed
2023-04-25 15:03:22.712345-0400 objcbonjour[26277:983754] [] nw_protocol_default_input_finished called with null protocol->default_input_handler
2023-04-25 15:03:22.712959-0400 objcbonjour[26277:983754] [] nw_protocol_default_input_finished called with null protocol->default_input_handler, dumping backtrace:
        [x86_64] libnetcore-3100.102.1
    0   Network                             0x00007ff8058fe557 __nw_create_backtrace_string + 135
    1   Network                             0x00007ff805628528 _ZL34nw_protocol_default_input_finishedP11nw_protocolS0_ + 376
    2   libboringssl.dylib                  0x00007ff804fbfc7c nw_protocol_boringssl_input_finished + 301
    3   Network                             0x00007ff805ab2045 _ZL29nw_socket_handle_socket_eventP9nw_socket + 1445
    4   libdispatch.dylib                   0x000000010acec7ec _dispatch_client_callout + 8
    5   libdispatch.dylib                   0x000000010acefa44 _dispatch_continuation_pop + 836
    6   libdispatch.dylib                   0x000000010ad07851 _dispatch_source_invoke + 2226
    7   libdispatch.dylib                   0x000000010acf6c76 _dispatch_workloop_invoke + 2692
    8   libdispatch.dylib                   0x000000010ad03982 _dispatch_workloop_worker_thread + 962
    9   libsystem_pthread.dylib             0x00007ff837749c55 _pthread_wqthread + 327
    10  libsystem_pthread.dylib             0x00007ff837748bbf start_wqthread + 15
2023-04-25 15:03:27.691437-0400 objcbonjour[26277:983753] [connection] nw_socket_handle_socket_event [C1.1.7.1:3] Socket SO_ERROR [60: Operation timed out]
2023-04-25 15:03:29.707140-0400 objcbonjour[26277:983754] [connection] nw_socket_handle_socket_event [C1.1.9.2:3] Socket SO_ERROR [60: Operation timed out]
2023-04-25 15:03:31.714222-0400 objcbonjour[26277:983754] [connection] nw_socket_handle_socket_event [C1.1.9.3:3] Socket SO_ERROR [60: Operation timed out]
2023-04-25 15:03:33.723231-0400 objcbonjour[26277:983754] [connection] nw_socket_handle_socket_event [C1.1.9.4:3] Socket SO_ERROR [60: Operation timed out]
2023-04-25 15:03:33.725218-0400 objcbonjour[26277:983754] [connection] nw_connection_add_timestamp_locked_on_nw_queue [C1] Hit maximum timestamp count, will start dropping events
2023-04-25 15:03:35.729161-0400 objcbonjour[26277:983754] [connection] nw_socket_handle_socket_event [C1.1.9.5:3] Socket SO_ERROR [60: Operation timed out]
2023-04-25 15:03:37.734531-0400 objcbonjour[26277:983754] [connection] nw_socket_handle_socket_event [C1.1.9.6:3] Socket SO_ERROR [60: Operation timed out]
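
One thing that might be worth checking here is whether the handshake error is delivered through the `.waiting` state rather than `.failed`, since the log shows the connection continuing to try other paths after the handshake is rejected. The sketch below watches both states. `tlsStatus(of:)` and `installHandshakeWatcher` are hypothetical helpers written for this example, and whether a rejected PSK actually arrives as `NWError.tls` (rather than, say, a POSIX reset error) is an assumption that should be verified by logging the error.

```swift
import Foundation
import Network

/// Returns the TLS status code if the error came from the TLS layer, else nil.
func tlsStatus(of error: NWError) -> OSStatus? {
    if case .tls(let status) = error { return status }
    return nil
}

/// Hypothetical helper: reports TLS-level errors seen in either the
/// .waiting or the .failed state, then cancels the connection.
func installHandshakeWatcher(on connection: NWConnection,
                             onTLSFailure: @escaping (OSStatus) -> Void) {
    connection.stateUpdateHandler = { [weak connection] state in
        switch state {
        case .waiting(let error), .failed(let error):
            print("Connection state error: \(error)")
            if let status = tlsStatus(of: error) {
                onTLSFailure(status)
                // Stop further attempts instead of waiting for them to time out.
                connection?.cancel()
            }
        default:
            break
        }
    }
}
```

If the printed error turns out to be a POSIX error instead of a TLS one, the same handler is still the place to decide whether to give up and cancel.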