Sockets created in NE app are bound to utun interface on Ventura 13

First, sorry for the long message, but I wanted to give as much info as possible.

I have a VPN app that uses Network Extension and OpenVPN on Ventura (13.1). Before Ventura everything worked fine.

I have a problem with sockets created from the network extension. Sockets created in the extension are assigned to the tunnel interface (utun3 in my case).

Scenario:

  1. Start the VPN (includeAllNetworks=true) => the OS creates utun3 and calls startTunnel in the NE app.
  2. In the extension, the app connects to the VPN server.
  3. The app calls setTunnelNetworkSettings with the new configuration; when that finishes, it calls the completion block from startTunnel and sets reasserting = false.
  4. After 2 seconds, the NE creates a new socket (C API) and connects => the socket is bound to the tunnel interface.

# lsof output (Wi-Fi IP = 192.168.0.163, utun3 IP = 10.7.1.4)
8u     IPv4 0xb394555904672715        0t0                 TCP 192.168.0.163:60266->VPN_SERVER_IP (ESTABLISHED)
9u     IPv4 0xb394555904673d35        0t0                 TCP 10.7.1.4:60284->SOME_WEBSITE_IP:http (ESTABLISHED)

From this point on, all sockets created from the NE app are bound to the tunnel instead of the Wi-Fi interface. The tunnel must be restarted for things to work again.
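
The same check that lsof gives can be done from inside the extension with getsockname. This is only a minimal sketch, assuming IPv4; the helper names local_ipv4_of_socket and socket_uses_address are made up for illustration and are not part of any Apple API. It compares the socket's local address with an address you pass in (for example the utun3 address 10.7.1.4 from the output above).

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: writes the local IPv4 address of a connected
 * socket into buf. Returns 0 on success, -1 on failure. */
int local_ipv4_of_socket(int fd, char *buf, size_t buflen) {
    struct sockaddr_in local;
    socklen_t len = sizeof(local);

    memset(&local, 0, sizeof(local));
    if (getsockname(fd, (struct sockaddr *)&local, &len) != 0) {
        return -1;
    }
    if (local.sin_family != AF_INET) {
        return -1; /* this sketch only handles IPv4 */
    }
    if (inet_ntop(AF_INET, &local.sin_addr, buf, (socklen_t)buflen) == NULL) {
        return -1;
    }
    return 0;
}

/* Hypothetical helper: returns 1 if the socket's local address equals
 * expected_ip, 0 if it does not, and -1 on error. */
int socket_uses_address(int fd, const char *expected_ip) {
    char buf[INET_ADDRSTRLEN];

    if (local_ipv4_of_socket(fd, buf, sizeof(buf)) != 0) {
        return -1;
    }
    return strcmp(buf, expected_ip) == 0 ? 1 : 0;
}
```

Calling socket_uses_address(fd, "10.7.1.4") right after connect returns 1 when the socket ended up on the tunnel address, which makes the problem easy to log without running lsof.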

What "helps" is to delay, by at least 0.5 seconds (less does not work), the call to the startTunnel completion block and reasserting = false. The delay is added inside the completion block of setTunnelNetworkSettings, after the VPN is connected:

// The connection to the VPN server has already been made
setTunnelNetworkSettings(networkSettings) { error in
	// Delay completing startTunnel by at least 0.5 s
	DispatchQueue.main.asyncAfter(deadline: .now() + 0.5) {
		start_tunnel_completion_block()
		reasserting = false
		DispatchQueue.main.async {
			self.connectToSomeSocket()
		}
	}
}

I've enabled the extra logging for necp, and I saw that, if no delay is used, necp creates a new rule for the VPN app that makes it bind to utun3.

My system configuration is:

  • The Wi-Fi interface (en0) has index 15.
  • utun3, created when the tunnel starts, has index 22.
  • The Network Extension (tunnel) app has PID 37567 in this case.
  • The policy ID is 14569; this policy is created after the app calls the completion block from startTunnel and sets reasserting = false.
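
The interface indexes above are what necp logs as BoundInterface. As a minimal sketch (the function names are made up for illustration), the POSIX calls if_nameindex and if_indextoname can list or resolve them, so an index such as 15 or 22 can be mapped back to a name on your own machine:

```c
#include <net/if.h>
#include <stdio.h>

/* Hypothetical helper: prints "index<TAB>name" for every interface.
 * Returns the number of interfaces, or -1 on error. */
int print_interface_indexes(void) {
    struct if_nameindex *list = if_nameindex();
    int count = 0;

    if (list == NULL) {
        return -1;
    }
    /* The array is terminated by an entry with if_index == 0. */
    for (struct if_nameindex *p = list; p->if_index != 0; p++) {
        printf("%u\t%s\n", p->if_index, p->if_name);
        count++;
    }
    if_freenameindex(list);
    return count;
}

/* Hypothetical helper: resolves one index (for example the 22 in
 * "BoundInterface 22") to a name. Returns 0 on success, -1 if no
 * interface has that index. */
int interface_name_for_index(unsigned int index, char name[IF_NAMESIZE]) {
    return if_indextoname(index, name) != NULL ? 0 : -1;
}
```

Indexes are assigned per boot and per tunnel start, so the values will differ from the ones listed here.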

NECP log (not from the same run as the lsof output above):

# While connecting to the VPN server, the socket matched another rule that has interface index 15 (en0)
error	10:44:55.101389+0100	kernel	necp_socket_find_policy_match_with_info_locked: DATA-TRACE <SOCKET>: EXAMINING - policy id=14557 session_order=2002 policy_order=10806 result=IP_TUNNEL (cond_policy_id 0)
error	10:44:55.101392+0100	kernel	necp_socket_check_policy: DATA-TRACE <SOCKET>: ------ matching <NECP_KERNEL_CONDITION_BOUND_INTERFACE> <value (15 / 0xF) (0 / 0x0) (0 / 0x0) input (15 / 0xF) (0 / 0x0) (0 / 0x0)>
error	10:44:55.101397+0100	kernel	necp_socket_check_policy: DATA-TRACE <SOCKET>: ------ matching <NECP_KERNEL_CONDITION_APP_ID> <value (66309 / 0x10305) (0 / 0x0) (0 / 0x0) input (66309 / 0x10305) (0 / 0x0) (0 / 0x0)>
error	10:44:55.101401+0100	kernel	necp_socket_check_policy: DATA-TRACE <SOCKET>: ------ matching <NECP_KERNEL_CONDITION_PID> <value (37567 / 0x92BF) (0 / 0x0) (0 / 0x0) input (37567 / 0x92BF) (0 / 0x0) (0 / 0x0)>
error	10:44:55.101404+0100	kernel	necp_socket_find_policy_match_with_info_locked: DATA-TRACE <SOCKET <private>>: MATCHED POLICY - proto 6 port <local 53511/53511 remote 1231/1231> <drop-all order 11001> <pid=37567 Application 66309 Real Application 66309 BoundInterface 15> (policy id=14557 session_order=2002 policy_order=10806 result=IP_TUNNEL)

# After connecting to the VPN server and calling the completion block from `startTunnel` and setting `reasserting = false`
....
default	10:44:55.511326+0100	kernel	necp_kernel_socket_policy_add: Added kernel policy: socket, id=14569, mask=4202
...
default	10:44:55.512624+0100	kernel	necp_kernel_socket_policies_dump_all: 	  5. Policy ID: 14569	Process: nesessionm	Order: 2002.10806	Mask: 4202	Result: IPTunnel (utun3)
....

error	10:45:12.225306+0100	kernel	necp_socket_find_policy_match_with_info_locked: DATA-TRACE <SOCKET>: EXAMINING - policy id=14569 session_order=2002 policy_order=10806 result=IP_TUNNEL (cond_policy_id 0)
error	10:45:12.225308+0100	kernel	necp_socket_check_policy: DATA-TRACE <SOCKET>: ------ matching <NECP_KERNEL_CONDITION_BOUND_INTERFACE> <value (22 / 0x16) (0 / 0x0) (0 / 0x0) input (22 / 0x16) (0 / 0x0) (0 / 0x0)>
error	10:45:12.225312+0100	kernel	necp_socket_check_policy: DATA-TRACE <SOCKET>: ------ matching <NECP_KERNEL_CONDITION_APP_ID> <value (66309 / 0x10305) (0 / 0x0) (0 / 0x0) input (66309 / 0x10305) (0 / 0x0) (0 / 0x0)>
error	10:45:12.225316+0100	kernel	necp_socket_check_policy: DATA-TRACE <SOCKET>: ------ matching <NECP_KERNEL_CONDITION_PID> <value (37567 / 0x92BF) (0 / 0x0) (0 / 0x0) input (37567 / 0x92BF) (0 / 0x0) (0 / 0x0)>
error	10:45:12.225320+0100	kernel	necp_socket_find_policy_match_with_info_locked: DATA-TRACE <SOCKET <private>>: MATCHED POLICY - proto 6 port <local 53537/53537 remote 1231/1231> <drop-all order 11001> <pid=37567 Application 66309 Real Application 66309 BoundInterface 22> (policy id=14569 session_order=2002 policy_order=10806 result=IP_TUNNEL)
default	10:45:12.225327+0100	kernel	necp_socket_find_policy_match: Socket Policy: <private> (BoundInterface 22 Proto 6) Policy 14569 Result 6 Parameter 22

Do you have any suggestions as to why the kernel would bind sockets from the NE app to the utun interface, or how to investigate this further?

And maybe any suggestions on how to fix this properly, instead of adding a delay to setTunnelNetworkSettings?

Thanks

Answered by smaryus in 751065022

The bug was fixed in macOS 13.3.

And maybe any suggestions how to properly fix this, instead of adding delay to setTunnelNetworkSettings?

Have you tried avoiding sockets altogether and moving to the Network Extension provided API such as NWTCPConnection or NWUDPSession? Likewise, NWConnection or nw_connection_t is also another great alternative.

Thanks for the suggestion. Unfortunately it is a little complicated to change to that API, because it would require changing some OpenVPN internals.

I did give it a try and it worked with createTCPConnection, but I see the same problem in the kernel for the socket created for NWTCPConnection.

Meaning, when the kernel creates the socket for NWTCPConnection, necp matches it against the "broken" rule and it is bound to the utun interface, but later, once it reaches the connected state, it is bound to the en0 interface. So what I think happens (maybe I'm wrong) is that createTCPConnection creates the socket on the utun interface because of necp, and then internally rebinds it to the en0 interface, because the socket is created in the NE app and cannot be bound to the utun interface.

Below is the matched policy (I'm connecting to a remote server on port 80):

error	15:32:36.187216+0100	kernel	necp_socket_find_policy_match_with_info_locked: DATA-TRACE <SOCKET>: EXAMINING - policy id=16174 session_order=2002 policy_order=10806 result=IP_TUNNEL (cond_policy_id 0)
error	15:32:36.187220+0100	kernel	necp_socket_check_policy: DATA-TRACE <SOCKET>: ------ matching <NECP_KERNEL_CONDITION_BOUND_INTERFACE> <value (22 / 0x16) (0 / 0x0) (0 / 0x0) input (22 / 0x16) (0 / 0x0) (0 / 0x0)>
error	15:32:36.187223+0100	kernel	necp_socket_check_policy: DATA-TRACE <SOCKET>: ------ matching <NECP_KERNEL_CONDITION_APP_ID> <value (66373 / 0x10345) (0 / 0x0) (0 / 0x0) input (66373 / 0x10345) (0 / 0x0) (0 / 0x0)>
error	15:32:36.187227+0100	kernel	necp_socket_check_policy: DATA-TRACE <SOCKET>: ------ matching <NECP_KERNEL_CONDITION_PID> <value (45946 / 0xB37A) (0 / 0x0) (0 / 0x0) input (45946 / 0xB37A) (0 / 0x0) (0 / 0x0)>
error	15:32:36.187231+0100	kernel	necp_socket_find_policy_match_with_info_locked: DATA-TRACE <SOCKET 0>: MATCHED POLICY - proto 6 port <local 59836/59836 remote 80/80> <drop-all order 11001> <pid=45946 Application 66373 Real Application 0 BoundInterface 22> (policy id=16174 session_order=2002 policy_order=10806 result=IP_TUNNEL)

Thanks again for the suggestion. I would have used it, but we have too many third-party libraries, not only OpenVPN, that work with sockets.

Connections that use NWTCPConnection are not ultimately sockets; they exist in user space and are created solely for the situation you are describing: in-provider networking for a Network Extension. If this API is working for you, then I would not worry about what the NECP policy rules are logging to the console. If this API is not working for you, and you can make this work with a port like 443 (instead of 80) or another port that uses a form of TLS in your tunnel authentication, then this may be grounds for a bug report, depending on the situation.

I'm sorry, but I'm not quite sure what you mean.

Today I did some more testing and got some more interesting results.

This is how I tested the NE app: the NE app connects to the VPN server (using socket/connect). Then setTunnelNetworkSettings is called. In the callback from setTunnelNetworkSettings, 2 connections are created after 0.5 seconds (to be sure the OS has finished setting everything up):

  1. a socket created with the C API (socket/connect) to an IP (not the VPN server) on port 80, and
  2. a createTCPConnection that waits to connect to another IP (not the VPN server) on port 80. I'm using addObserver to know when it is connected.

Example code

setTunnelNetworkSettings(networkSettings) { error in
	DispatchQueue.main.async {
		complete_from_start_tunnel(error)
		self.reasserting = false
		DispatchQueue.main.asyncAfter(deadline: .now() + 0.5) {
			// create socket with C-API for 85.120.19.5:80
			// create tcp connection with createTCPConnection(85.120.19.250, 80)
		}
	}
}

At the same time, I'm using lsof to get the socket information.

In both cases a socket is created in the end, and both sockets use the utun interface.

IPv4 0xb394555908552715        0t0                 TCP 192.168.0.163:52596->194.233.50.248:1231 (ESTABLISHED) <--- VPN server
IPv4 0xb394555908506715        0t0                 TCP 10.7.0.7:52617->85.120.19.5:80 (ESTABLISHED)  <-- this is the IP for C-API (socket/connect)
IPv4 0xb394555908443225        0t0                 TCP 10.7.0.7:52618->85.120.19.250:80 (SYN_SENT) <-- this is the IP for createTCPConnection

Both connections work fine and can send and read data as long as the VPN socket is alive. If for some reason I need to recreate the VPN socket, nothing works anymore, because any socket created from this point on uses the utun interface. (The VPN socket is created with the C API and I cannot change that...)

If I add a delay to the setTunnelNetworkSettings completion block, everything works fine and all the sockets are on en0.

setTunnelNetworkSettings(networkSettings) { error in
	DispatchQueue.main.asyncAfter(deadline: .now() + 0.5) {

thanks

a socket created with the C API (socket/connect) to an IP (not the VPN server) on port 80, and

a createTCPConnection that waits to connect to another IP (not the VPN server) on port 80.

What is the need for creating connections inside the Network Extension provider that are over port 80 and not to your VPN server?

In this case port 80 is just an example, so I can check whether the sockets are working or not.

In our production application there are cases where we need to connect to additional servers, and none of the sockets from the tunnel should go through the VPN server; they have their own packet encryption.

But my original question is why are the sockets bound to utun interface from NE app, no matter what port or IP are connecting?

  • Is there a new limitation on Ventura that an NE app must create only one socket for the entire app lifetime, even if in some cases I need to recreate the socket that connects to the VPN server? Because otherwise it will not work. This used to work fine in previous OS versions.
  • And why does everything work fine if I add a delay of 0.5 seconds to setTunnelNetworkSettings?

thanks

But my original question is why are the sockets bound to utun interface from NE app, no matter what port or IP are connecting?

I am not sure what the exact change was here that is causing this behavior, but from a VPN tunnel perspective you don't want traffic going outside of the tunnel anyways.

Regarding:

Is there a new limitation on Ventura that an NE app must create only one socket for the entire app lifetime, even if in some cases I need to recreate the socket that connects to the VPN server?

No. For this workflow I will continue to recommend the in-provider network APIs I mentioned above instead of going the sockets route.

Posting a bug number here, FB11965447, for the benefit of both Matt and Future Quinn™.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

