How to debug "Network is unreachable" - Edit: added some details

I've implemented a VPN app with Packet Tunnel Provider, both for iOS and macOS (both apps use the same extension code, with some minor differences).

For the macOS app, I'm getting reports from users that sometimes they can't connect using the client and have to disconnect and reconnect their network (wireless or wired) before they can connect again. In the app's logs, I can see the message "Network is unreachable".

This can't be literally true, because the users pinged multiple sites while it was happening and the pings went fine. So there is a reachable network.

More info:

They aren't in an IPv6-only environment (and anyway, my apps support IPv6).

I'm using BSD sockets.

Some users reported that when this happens, if they wait long enough (about 10 minutes), they are able to connect again.

The iOS app has no such problems even though it uses the same code.


So, in order to solve it, how can I get more information? Can my app get the network information and check the state of its sockets?

Can I get the system's logs and information and save them to my log file?

Is there something that might explain this behavior, like "the PacketTunnelProvider wasn't killed properly", or the PacketTunnelProvider crashing and causing networking problems?

Edit: Is it possible that one of the following cases is the cause of the "Network is unreachable" error?

In all of these cases, on-demand is enabled:


- The VPN disconnects and then immediately reconnects (on-demand takes effect), but it does so without waiting 20 seconds, which triggers the known bug in the macOS Packet Tunnel Provider.

- The device enters sleep mode, but the VPN tries to connect again and again (on-demand takes effect).

Replies

In this case I would avoid pre-flight checks of the network status before setting up the VPN tunnel, since they can expose the VPN to more scenarios where edge cases show up. The first thing I would do is start the VPN tunnel configuration and make sure that all of the criteria for starting the tunnel are met. For example, do you have any NEOnDemandRuleConnect rules set up that match on the interface or SSID? If these rules are failing when the tunnel is started, that would be one possible point of investigation.

To gain more insight, I would also try to reproduce this issue locally and check whether your server is logging these requests to set up the tunnel. If the requests show up in your server logs, then you know this is not a reachability issue. If your server has no record of the tunnel setup attempt, that points to a possible client-side issue.

I would first capture the logs from a device running the app in Console.app to get more information on what is taking place. You could use the os_log API to provide more insight at every step of the way. I would also compare the network environments of users who are using the VPN successfully against those of users who are having problems. Is there anything there that can be identified? Are there multiple proxies in the environment?



Basically, I would try and rule out whether request attempts are hitting your VPN server and then work your way from there.

Thanks. I'm not doing any pre-flight checks of the network status before setting up the VPN tunnel. Regarding NEOnDemandRuleConnect, I defined a rule to always connect (this is the closest thing to always-on). I do have logs on the server, but it seems the client isn't sending any requests. So it's a client issue.


However, one of my users described something more specific, and he says it's persistent:

He was at the office, connected to the LAN, and the VPN was off.

Then he left the office and walked home. The Mac entered sleep mode.

At home, he opened the Mac and connected to his Wi-Fi. Then he tried to turn on the VPN.

The VPN was stuck in the 'connecting' phase. He had to press the disconnect button and then try connecting again. Only then did it succeed.

From the logs I saw that when he first clicked the connect button, the error "network is unreachable" appeared.

When he pressed the connect button the second time, everything was fine.


So I think this is the situation:

disconnectOnSleep is set on the VPN's protocolConfiguration. So when the Mac enters sleep mode, the system calls stopTunnelWithReason. When the Mac wakes, because of my on-demand rules, the OS starts the PacketTunnelProvider again.

However, it seems that sometimes in those cases the network is reported as unreachable.

If I'm right, how can I solve it? And how can I print when the Mac enters sleep mode?

Yeah, it does sound like the VPN connection is getting hung when it tries to reconnect for some reason. I am not convinced it is a network reachability issue, but rather something else may be going on here because you stated, "When he pressed the connect button the second time, everything was fine."


You can certainly use the os_log API to help you diagnose this issue from the Console.app during all phases of this situation, whether awake or in sleep mode.

| And how can I print when the Mac enters sleep mode?


Out of curiosity, in the situation that you described, if you receive the "network is unreachable" error when connecting for the first time after the device wakes, are you able to test capturing this error, forcing a disconnect, and then reconnecting programmatically? What I am trying to prove here is that something in your previous VPN connection is still being cached somehow, possibly an interface that no longer exists. If this ends up working, it may be grounds for a bug report, depending upon what you see. If it does not work, then you may need to look elsewhere in your program.
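To make "capturing this error" concrete for a BSD sockets client, the failure can be detected at the connect() call by checking errno for ENETUNREACH, the code whose message is "Network is unreachable". This is a minimal sketch, assuming a blocking socket. The names connect_result, classify_connect_errno, and connect_and_classify are hypothetical and not taken from the poster's code.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical result type for the caller to act on. */
typedef enum {
    CONNECT_OK,          /* connect() succeeded */
    CONNECT_UNREACHABLE, /* ENETUNREACH: candidate for the forced reconnect test */
    CONNECT_FAILED       /* any other error */
} connect_result;

/* Maps an errno value saved after connect() to a result. */
connect_result classify_connect_errno(int err) {
    if (err == 0) {
        return CONNECT_OK;
    }
    if (err == ENETUNREACH) {
        return CONNECT_UNREACHABLE;
    }
    return CONNECT_FAILED;
}

/* Calls connect() on a blocking socket, logs any failure to `log`,
   and classifies the outcome. */
connect_result connect_and_classify(int fd, const struct sockaddr *addr,
                                    socklen_t len, FILE *log) {
    if (connect(fd, addr, len) == 0) {
        return CONNECT_OK;
    }
    int err = errno; /* save before another call can overwrite it */
    fprintf(log, "connect failed: errno=%d (%s)\n", err, strerror(err));
    return classify_connect_errno(err);
}
```

When the result is CONNECT_UNREACHABLE, the extension can record that fact so the disconnect and reconnect test can be triggered for exactly this case and no other.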


Matt Eaton

DTS Engineering, CoreOS

meaton3 at apple.com

Thanks, I'll try to capture this error and search for more helpful logs.

But if it's really something like "VPN connection is still being cached somehow, possibly an interface that no longer exists" - if I add exit(0) at all the "correct" places (where the extension should stop), will it help?

Adding an exit(0) will terminate your program and I would bet this is not the desired action in this case. Instead, the testing approach that I would recommend is to capture the incoming "Network is unreachable" error and programmatically force a disconnect on the VPN and then a new connect. If this works, it might mean there was a cached or hung VPN situation. If this does not work, then you will need to look elsewhere in your program to debug this issue.


| But if it's really something like "VPN connection is still being cached somehow, possibly an interface that no longer exists" - if I add exit(0) at all the "correct" places (where the extension should stop), will it help?


Matt Eaton

DTS Engineering, CoreOS

meaton3 at apple.com

1. "Adding an exit(0) will terminate your program and I would bet this is not the desired action in this case." - this is not accurate. I'll add the exit(0) only where the user asked to stop the tunnel, so instead of a "clean exit" from the extension, I'll just terminate it.

I read here on the forums that this is already accepted behavior (because of the known bug where, if you try to connect shortly after a disconnection, the connection succeeds but you get disconnected after 20 seconds).

So I'm guessing that this approach is considered OK, but please correct me if it may cause harm in some way.


2. By "capture the incoming "Network is unreachable" error and programmatically force a disconnect on the VPN and then a new connect", do you mean something like calling cancelTunnelWithError?


3. Thanks for the quick replies!

Yes. To capture the error and perform a programmatic disconnect / reconnect, you could register for the NEVPNStatusDidChange notification and call stopVPNTunnel. Then, once the NEVPNStatus changes for this specific scenario, test calling startVPNTunnelAndReturnError. This should give you a data point on how best to proceed with debugging your scenario.
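The ordering matters in this test: the new start should be issued only after the status has actually reached disconnected, and only once per failure. The sequencing can be sketched as a small state machine, written here in plain C so it can be exercised on its own. The enum values are stand-ins for NEVPNStatus, and the type and function names are hypothetical. The containing app would call stopVPNTunnel or startVPNTunnelAndReturnError when the returned action asks for it.

```c
#include <stdbool.h>

/* Stand-ins for the NEVPNStatus values reported by the system. */
typedef enum {
    VPN_DISCONNECTED,
    VPN_CONNECTING,
    VPN_CONNECTED,
    VPN_DISCONNECTING
} vpn_status;

/* What the caller should do next. */
typedef enum {
    ACTION_NONE,
    ACTION_STOP_TUNNEL, /* caller invokes stopVPNTunnel */
    ACTION_START_TUNNEL /* caller invokes startVPNTunnelAndReturnError */
} reconnect_action;

typedef struct {
    bool restart_pending; /* true between the forced stop and the restart */
} reconnect_state;

/* Call when the "Network is unreachable" failure has been detected. */
reconnect_action reconnect_on_unreachable(reconnect_state *s) {
    if (s->restart_pending) {
        return ACTION_NONE; /* a forced restart is already in flight */
    }
    s->restart_pending = true;
    return ACTION_STOP_TUNNEL;
}

/* Call from the NEVPNStatusDidChange observer with the new status. */
reconnect_action reconnect_on_status(reconnect_state *s, vpn_status status) {
    if (!s->restart_pending || status != VPN_DISCONNECTED) {
        return ACTION_NONE; /* not our restart, or not fully stopped yet */
    }
    s->restart_pending = false;
    return ACTION_START_TUNNEL;
}
```

Keeping the pending flag separate from the status means an ordinary user-initiated disconnect never triggers an automatic restart. Only a stop that followed the captured error does.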


Matt Eaton

DTS Engineering, CoreOS

meaton3 at apple.com

OK, I'll do it and report back if I find anything interesting.

Thanks again!