NetworkExtension-based NKE replacement in Catalina

Hi,


I am currently moving the Network Kernel Extension (NKE) based socket filtering in our application to the NetworkExtension-based filtering newly available in Catalina. There are a few points where I would like more detail, especially on how some of our NKE-implemented requirements translate to NetworkExtension. More precisely:

  • Ability to monitor traffic volume in (near) real-time
  • Ability to disconnect / block a stream that is currently transferring data, i.e. that was previously allowed to start

Both of those enable our app to act as a kill switch when users are tethering and about to go over their data limit during a big file download for example.
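
To make the kill-switch requirement concrete, here is a minimal sketch of the accounting side only, written in plain Swift so that it runs without NetworkExtension. The names DataCapAccountant and FlowDecision are hypothetical and not part of any Apple API; in a real filter, the byte count would come from readBytes.count and the decision would be mapped to a filter verdict.

```swift
import Foundation

/// Hypothetical decision type; a real filter would translate this into a verdict.
enum FlowDecision: Equatable {
    case keepFiltering   // let the bytes through and keep watching the flow
    case block           // cut the flow, even if it is mid-transfer
}

/// Hypothetical helper that tracks bytes seen across all flows against a data cap.
struct DataCapAccountant {
    let capBytes: Int
    private(set) var usedBytes = 0

    init(capBytes: Int) { self.capBytes = capBytes }

    /// Records newly observed bytes and reports whether the flow may continue.
    mutating func record(byteCount count: Int) -> FlowDecision {
        usedBytes += count
        return usedBytes > capBytes ? .block : .keepFiltering
    }
}

var accountant = DataCapAccountant(capBytes: 1_000)
let first = accountant.record(byteCount: 600)   // still under the cap
let second = accountant.record(byteCount: 600)  // pushes usage over the cap
```

The point of keeping the accounting separate from the provider callbacks is that the same logic could sit behind a content filter or a transparent proxy, whichever API turns out to fit.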


This steered me towards the new additions to Content Filter Providers with NEFilterDataProvider, but after seeing some issues during testing I want to be sure this is the correct choice. If I’m misusing the API, I would need to evaluate other options such as a transparent proxy.


In order to get the data count as the flow progresses, I’m returning a NEFilterNewFlowVerdict.filterDataVerdict() from the NEFilterDataProvider subclass handleNewFlow() and then NEFilterDataVerdict(passBytes:, peekBytes:) until the complete() methods are called. This seems to fulfil the requirements as I can use the readBytes.count to do the accounting and eventually return a .block() verdict in case I have to stop the flow mid-transfer. But:


1. The readBytes object seems to contain an unspecified header, apparently a size followed by something similar to the auditToken, for the first call when offset is 0. It seems its size has changed during the beta cycle and is currently 136 bytes. Are there any constants and / or definitions so that I can skip over the header or subtract its size?


Developer documentation states in https://developer.apple.com/documentation/networkextension/nefilterdataverdict/1619005-init?changes=latest_minor that 'the system does not pass the bytes to the destination until a "final" (i.e. allow/drop/remediate) verdict is returned'. This does not seem to be the case on macOS, at least in some test cases such as the one in the code below. Using two nc processes to establish and keep open a TCP stream on the local machine, one can observe that bytes are transferred immediately from one to the other, even with the (passBytes:, peekBytes:) verdict. The handler methods appear to be called only once a sufficient number of bytes has accumulated, though (experimentally 136 in this case, the same as the initial header size).


2. Is this use case of returning a filterData verdict for all packets until the end of the stream an intended use of the API (which fits our requirements well) or am I misusing a mechanism that should only be reserved for the first packets until a final decision is made?

The latter case would compromise our app's ability to stop an open stream, and we would need to use some other API for data accounting, if available.


3. During the above usage, networking performance takes a heavy hit in the beta on high-speed links (above 100 Mbps), with a very large amount of time spent in NetworkExtension, CPU usage spiking to 100% and above for the filter extension, and 30-second pauses in traffic. Is this a known beta issue that should be solved once release / optimized builds arrive, or is it due to my specific usage? Our current NKE solution has negligible measured impact in the same conditions, as it does not need to copy data to user space.


Partial call stack for reference; all of this runs in NetworkExtension in the filter app extension.

30.16 s  100.0% 70.00 ms   -[NEFilterDataExtensionProviderContext handleSocketSourceEventWithSocket:]
26.04 s   86.3% 11.00 ms    -[NEFilterDataSavedMessageHandler enqueueWithFlow:]
25.93 s   85.9% 16.00 ms     -[NEFilterDataSavedMessageHandler executeWithFlow:]
18.30 s   60.6% 6.00 ms      __74-[NEFilterDataExtensionProviderContext handleSocketSourceEventWithSocket:]_block_invoke_2.520
18.13 s   60.0% 12.00 ms       __74-[NEFilterDataExtensionProviderContext handleSocketSourceEventWithSocket:]_block_invoke.516
17.83 s   59.1% 28.00 ms        __74-[NEFilterDataExtensionProviderContext handleSocketSourceEventWithSocket:]_block_invoke.515
17.75 s   58.8% 35.00 ms         -[NEFilterDataExtensionProviderContext socketContentFilterWriteMessageWithControlSocket:socketID:drop:inboundPassOffset:inboundPeekOffset:outboundPassOffset:outboundPeekOffset:]
17.71 s   58.7% 17.71 s          write

4. Is there some more documentation about the lifecycle of NEFilterFlow objects? I tried using their hash value to key a dictionary of flow decisions (allow and continue to monitor = filterData verdict, or block verdict if not) based on our app- and process-level rules, to avoid re-evaluating the rules for each packet in the open flow. This is more or less the way I've been using the cookie in KEXT socket filters. However, it seems that NEFilterFlow objects are somehow reused for other flows of unrelated processes, which defeats the hash-based keying. Note that this is not a showstopping issue, as it apparently can be mitigated by adding the auditToken to the hash.



I'm most concerned about 2 & 3 as those determine the rest of the implementation so it would be great if I could have some hints on that matter.


Thanks


/*
Use two Terminal windows with
'nc [IP on local network] -l 1234' for listening
'nc [IP on local network] 1234' for sending, then type text and enter.
Note that lo0 traffic does not seem to appear hence the usage of local network addresses.
*/

override func handleNewFlow(_ flow: NEFilterFlow) -> NEFilterNewFlowVerdict {
     if let f = flow as? NEFilterSocketFlow, let l = f.localEndpoint as? NWHostEndpoint, let r = f.remoteEndpoint as? NWHostEndpoint {
        if let path = Pid.processPathForPid(f.pid), path.contains("/usr/bin/nc") {
           NSLog("\(path) New flow \(f.direction) \(l.hostname):\(l.port)<->\(r.hostname):\(r.port)")
        }
     }
     // Can’t use NEFilterFlowBytesMax below (UInt64 VS Int required)
     return NEFilterNewFlowVerdict.filterDataVerdict(withFilterInbound: true, peekInboundBytes: Int.max, filterOutbound: true, peekOutboundBytes: Int.max)
  }

  override func handleOutboundData(from flow: NEFilterFlow, readBytesStartOffset offset: Int, readBytes: Data) -> NEFilterDataVerdict {
     if let f = flow as? NEFilterSocketFlow, let l = f.localEndpoint as? NWHostEndpoint, let r = f.remoteEndpoint as? NWHostEndpoint {
        if let path = Pid.processPathForPid(f.pid), path.contains("/usr/bin/nc") {
           let headerSize = offset == 0 ? 136 : 0
           NSLog("\(path) \(f.direction) out:\(readBytes.count - headerSize) offset:\(offset) \(l.hostname):\(l.port)<->\(r.hostname):\(r.port)")
        }
     }
     return NEFilterDataVerdict(passBytes: readBytes.count, peekBytes: Int.max)
  }

  override func handleInboundData(from flow: NEFilterFlow, readBytesStartOffset offset: Int, readBytes: Data) -> NEFilterDataVerdict {
     if let f = flow as? NEFilterSocketFlow, let l = f.localEndpoint as? NWHostEndpoint, let r = f.remoteEndpoint as? NWHostEndpoint {
        if let path = Pid.processPathForPid(f.pid), path.contains("/usr/bin/nc") {
           let headerSize = offset == 0 ? 136 : 0
           NSLog("\(path) \(f.direction) in:\(readBytes.count - headerSize) offset:\(offset) \(l.hostname):\(l.port)<->\(r.hostname):\(r.port)")
        }
     }
     return NEFilterDataVerdict(passBytes: readBytes.count, peekBytes: Int.max)
  }

  override func handleOutboundDataComplete(for flow: NEFilterFlow) -> NEFilterDataVerdict {
     if let f = flow as? NEFilterSocketFlow, let l = f.localEndpoint as? NWHostEndpoint, let r = f.remoteEndpoint as? NWHostEndpoint {
        if let path = Pid.processPathForPid(f.pid), path.contains("/usr/bin/nc") {
           NSLog("\(path) \(f.direction) Out complete \(l.hostname):\(l.port)<->\(r.hostname):\(r.port)")
        }
     }
     let verdict = NEFilterDataVerdict.allow()
     verdict.shouldReport = true
     return verdict
  }

  override func handleInboundDataComplete(for flow: NEFilterFlow) -> NEFilterDataVerdict {
     if let f = flow as? NEFilterSocketFlow, let l = f.localEndpoint as? NWHostEndpoint, let r = f.remoteEndpoint as? NWHostEndpoint {
        if let path = Pid.processPathForPid(f.pid), path.contains("/usr/bin/nc") {
           NSLog("\(path) \(f.direction) In complete \(l.hostname):\(l.port)<->\(r.hostname):\(r.port)")
        }
     }
     let verdict = NEFilterDataVerdict.allow()
     verdict.shouldReport = true
     return verdict
  }

  override func handle(_ report: NEFilterReport) {
     if report.event == .flowClosed, let f = report.flow {
        if let f = f as? NEFilterSocketFlow, let l = f.localEndpoint as? NWHostEndpoint, let r = f.remoteEndpoint as? NWHostEndpoint {
           if let path = Pid.processPathForPid(f.pid), path.contains("/usr/bin/nc") {
              NSLog("\(path) \(f.direction) Flow closed \(l.hostname):\(l.port)<->\(r.hostname):\(r.port)")
           }
        }
     }
  }

// An extension has been defined on NEFilterFlow to get the pid from the audit token.

Replies

1. The readBytes object seems to contain an unspecified header, apparently a size followed by something similar to the auditToken, for the first call when offset is 0.

This is surprising. Certainly, this is not how content filters work on iOS.

2. Is this use case of returning a filterData verdict for all packets until the end of the stream an intended use of the API … ?

It’s certainly unusual, but I don’t see anything wrong with it.

3. During the above usage, networking performance takes a heavy hit in the beta when on high speed links (above 100Mbps) …

NE requires traffic to bounce in and out of the kernel [1], so I’d expect to see some sort of performance hit. If you’re seeing an unacceptable performance drop — and that certainly seems to be the case here — I recommend that you file a bug about that.

Please post your bug number, just for the record.

4. Is there some more documentation about the lifecycle of NEFilterFlow objects?

I think I must be missing something here. The flow object should identify the flow, so mapping it to your own internal state via a hash table is fine. You can avoid worrying about reuse by clearing out the hash table entry when you’re done with the flow, that is, when you hit EOF or you decide to block it. What am I missing here?
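
As a rough illustration of the "clear the entry when you're done" idea, here is a minimal sketch in plain Swift. Flow is a stand-in class so the example runs without NetworkExtension, and FlowDecisionCache and CachedDecision are hypothetical names. It assumes object identity is a usable key while the flow is alive, which is the behaviour described above rather than something the documentation guarantees.

```swift
import Foundation

/// Stand-in for NEFilterFlow so the sketch runs without NetworkExtension.
final class Flow {}

/// Hypothetical cached outcome of the app- and process-level rules.
enum CachedDecision { case monitor, block }

/// Caches one decision per live flow, keyed by object identity.
struct FlowDecisionCache {
    private var decisions: [ObjectIdentifier: CachedDecision] = [:]

    /// Returns the cached decision, evaluating the rules only on first sight of the flow.
    mutating func decision(for flow: Flow, evaluate: () -> CachedDecision) -> CachedDecision {
        let key = ObjectIdentifier(flow)
        if let cached = decisions[key] { return cached }
        let fresh = evaluate()
        decisions[key] = fresh
        return fresh
    }

    /// Call at EOF or when blocking, so a recycled object never inherits stale state.
    mutating func forget(_ flow: Flow) {
        decisions[ObjectIdentifier(flow)] = nil
    }

    var count: Int { decisions.count }
}

var cache = FlowDecisionCache()
let flow = Flow()
var evaluations = 0
let d1 = cache.decision(for: flow) { evaluations += 1; return .monitor }
let d2 = cache.decision(for: flow) { evaluations += 1; return .block }  // served from the cache
cache.forget(flow)
```

The cache does not retain the flow, so an identifier can legitimately come back for a different flow later. That is why the forget step matters more than the choice of key.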

Notwithstanding all of the above, it might be worth playing around with a transparent proxy. I suspect it’ll be a better match for the architecture you inherited from your old socket filter NKE.

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

[1] Keep in mind that our ongoing support for NKEs means that macOS can’t use the user space networking stack that we use on iOS.

I've tried the transparent proxy approach, but I hit the same issue as reported in https://forums.developer.apple.com/thread/121823: the save fails with the error 'Missing protocol or protocol has invalid type'. I've used a NETunnelProviderProtocol as you instructed in the replies there, but the error stays the same, and the scarce docs don't help much.

Below is the result of dumping the protocol object that caused the error to the logs:


    type = plugin
    identifier = 7AB53020-1331-47DD-B39A-B3A3B90762E7
    serverAddress = localhost
    identityDataImported = NO
    disconnectOnSleep = NO
    disconnectOnIdle = NO
    disconnectOnIdleTimeout = 0
    disconnectOnWake = NO
    disconnectOnWakeTimeout = 0
    disconnectOnUserSwitch = NO
    disconnectOnLogout = NO
    includeAllNetworks = YES
    excludeLocalNetworks = NO
    authenticationMethod = 0
    reassertTimeout = 0
    providerBundleIdentifier = [redacted]

I think you’re getting this error because the system has misidentified you as a Personal VPN app. What entitlements have you set? I’d expect you:

  • To not have com.apple.developer.networking.vpn.api
  • To have com.apple.developer.networking.networkextension with an app-proxy-provider entry

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

I have been using com.apple.developer.networking.networkextension with app-proxy-provider for all my testing, as set by Xcode in the entitlements for both the main app and the extension.

This is the same code and project I've used for the content filter / NEFilterDataProvider filtering that enables the SystemExtension, with two changes. I switched the networkextension entitlement to app-proxy-provider using the UI (and verified it in the .entitlements file), and I switched the NEProviderClasses of the extension to the appropriate NEAppProxyProvider subclass, which, as I understand it, is the correct base class to use. I cleaned and rebuilt the project to ensure nothing remained cached.

Failure happens in the main app when trying to save the enabled manager to preferences at the same point as the referenced 121823 post. This occurs regardless of the install / enabled state of the SystemExtension as I've tried both cases.

Is there some macOS Transparent Proxy sample code from which I could restart and check my own?

Is there some macOS Transparent Proxy sample code from which I could restart and check my own?

No, alas.

Normally I’d bounce you in to DTS so that I could research this in depth, but DTS is not yet offering official support for 10.15 beta. I’ll see if I can dig something up unofficially.

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

I’ll see if I can dig something up unofficially.

OK. I’ve confirmed that NETunnelProviderProtocol is the right option here. Here’s a snippet of code from an internal test project:

let proto = NETunnelProviderProtocol()
proto.serverAddress = "example.com"

let manager = NETransparentProxyManager()
manager.localizedDescription = "Transparent Proxy Test"
manager.protocolConfiguration = proto
manager.isEnabled = true

manager.saveToPreferences { error in
    // … check for error …
}

If that’s not working, there’s probably a packaging problem, although it’s hard to say what that might be.

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

Thanks, the issue was previously calling NETransparentProxyManager.shared() (as in https://forums.developer.apple.com/thread/121823#379646) which is not overridden by NETransparentProxyManager and returns a NEVPNManager instead, which in turn wants a protocol configuration that is not the correct one in this case.
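
For anyone hitting the same thing, the pitfall can be reproduced in plain Swift without NetworkExtension. The sketch below uses hypothetical stand-in classes (BaseManager and TransparentManager are not Apple API) to show how a class method inherited from a base class hands back a base-class instance when the subclass does not override it. This is an illustration of the general mechanism under that assumption, not a look inside the framework.

```swift
import Foundation

/// Hypothetical stand-in for the base manager class.
class BaseManager {
    /// The shared instance is created as a BaseManager, whatever type the caller names.
    static let sharedInstance = BaseManager()
    class func shared() -> BaseManager { sharedInstance }
    var kind: String { "base" }
}

/// Hypothetical subclass that does not override shared(), mirroring the case described above.
final class TransparentManager: BaseManager {
    override var kind: String { "transparent" }
}

let inherited = TransparentManager.shared()  // resolves to BaseManager.shared()
let direct = TransparentManager()            // an actual subclass instance
```

Creating the manager with its own initializer, as in the NETransparentProxyManager() snippet earlier in the thread, avoids getting the base-class object back.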


The proxy shows up as a VPN in the network pref pane, it starts, and all network flows are diverted to it as I set up in the network rules, but I'm certainly missing something obvious after that:

When I try to open a connection from the extension to the initial remote host (either a Network.NWConnection or a NetworkExtension.NWTCPConnection obtained from .createTCPConnection()), it stalls on waiting path (unsatisfied (Path was denied by NECP policy)) in the console with info logs on.

This happens for every connection from the extension after the transparent proxy is started. Remote host and protocol have no influence on the result. The extension sandbox has the client / server entitlements. The only cases I've found about denial by NECP policy were on iOS and don't seem to apply here.

Are there some special conditions for transparent proxy operation (like excluding the extension from the proxied path to avoid a loop, specific entitlements, or a need to hapen ) ?

I don’t know )-:

At this point I’m going to recommend that you open a DTS tech support incident so that I can spend some time researching this (with 10.15 having a GM seed, DTS has started official support for it).

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

Oh hey, I happened to bump into the NE team today and asked them about your issue. It seems that this is a known bug in the 10.15 betas. We’re not entirely sure whether it’s fixed in the 10.15 GM seed that’s currently seeding, but it’s definitely worth re-testing on that.

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

I re-tested on Catalina GM, the issue is the same. As it's also likely to be a configuration issue with my project, I created a DTS request (721826592) so that some time can be devoted to investigating this.

I re-tested on Catalina GM, the issue is the same.

Bummer.

As it's also likely to be a configuration issue with my project, I created a DTS request (721826592) so that some time can be devoted to investigating this.

Thanks. This will probably land on my desk later today.

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

I am also seeing the 'Missing protocol or protocol has invalid type' error message with the code below. Please help me configure the correct protocol for the NETransparentProxyManager object.


VCDebug Failed to save the filter configuration: Missing protocol or protocol has invalid type


I am testing this sample code on the 10.15.4 beta.



func enableFilterConfiguration() {
    os_log("VCDebug enableFilterConfiguration() enter")
    let appTransparentProxyManager = NETransparentProxyManager.shared()

    guard !appTransparentProxyManager.isEnabled else {
        registerWithProvider()
        return
    }

    loadFilterConfiguration { success in
        guard success else {
            self.status = .stopped
            return
        }

        let proto = NETunnelProviderProtocol()
        //proto.serverAddress = "127.0.0.1"
        appTransparentProxyManager.localizedDescription = "Transparent Proxy Test"
        appTransparentProxyManager.protocolConfiguration = proto
        appTransparentProxyManager.isEnabled = true

        appTransparentProxyManager.saveToPreferences { saveError in
            DispatchQueue.main.async {
                if let error = saveError {
                    os_log("VCDebug Failed to save the filter configuration: %@", error.localizedDescription)
                    self.status = .stopped
                    return
                }

                self.registerWithProvider()
            }
        }
    }
}