Filtering DNS packets via Network Extension

Hello, we are working on a feature that sometimes needs to drop or postpone DNS requests, except for requests for allowed domains or requests originating from allowed executables. The product already includes a Network Extension Socket Filter. We have been looking for the correct technology to use for this purpose.

1) A socket filter cannot be used, as it is not possible to selectively drop only some UDP packets. If we block one flow, the socket can no longer be used to communicate with the specific remote IP and fails with EPIPE. This leads to DNS resolution issues, as not all software can cope with UDP failing for only some of the remote servers (AFAIK this includes Apple's own mDNSResponder).

2) Loading socket filter drops all active connections. This is understandable for firewall type API, but it is an extremely inconvenient behavior if we do not want to actually implement a firewall. There is currently no option to configure this behavior of socket filter.

3) Using a DNS proxy is not really feasible, as it is not a transparent proxy and only a single DNS proxy can run at any given time. If multiple DNS proxies are registered, only the last one stays running. It is also a rather heavyweight solution for what we want to accomplish.

We tried using the BSD pf packet filter with user-specific rules such as "pass out quick from any to any keep state user nameofuser", but 4) this breaks if a DNS proxy is present. The attribution of proxied flows is not visible to pf, which sees all such traffic as belonging to the DNS proxy instead of the original process. Apple has reported this as working as intended.

The only other solution seems to be a packet level filter. But here we hit other issues:

5) Since the order of processing is Socket Filter -> DNS proxy -> VPN -> packet filter, we cannot guarantee that all traffic will be filtered (the packet filter does not see at least some VPN traffic on macOS; we did not test this with all VPN types, though).

6) the NEFilterPacketProvider does not seem to have a way how to attribute the packet to a process. While the NEPacket obtained via delayCurrentPacket() has a metadata member, it never seems to be present on macOS (at least on Monterey). This prevents per-app/process/binary filtering and leaves only packet content inspection as an option. While it may in theory be possible to use the Socket Filter to attribute packet-level addresses to processes, this seems cumbersome and potentially fragile when a DNS proxy is in use.

7) There is an issue with the coexistence of Packet and Socket filters. It seems that any change to NEFilterManager’s configuration concerning the packet filter also causes a brief stop and start of the socket filter. This is extremely inconvenient, because a socket filter reload subsequently drops all connections on the system. Please note that loading the packet filter does not cause such a drop of connections, so it would be ideal for our purpose.

The only workaround is to have multiple system extensions, which is actually the correct engineering approach but leads to a horrible user experience. Allowing multiple system extensions is far from streamlined for the average user, who would also need to allow each filter separately, one after another. If we were to use a socket filter, DNS proxy, VPN and packet filter in a single product, ideally each in a standalone system extension for resilience, the user would need to get through 8 separate dialogs! Adding a feature during the lifetime of the product should also not lead to repeated requests to allow system extensions; this is a nightmare from an administration point of view. It should really be a once-per-app action (at least the loading of the system extension). But we are getting sidetracked. Coexistence seems like the most feasible user-centric solution, but it is not really possible with the combination of Socket and Packet filters.

Am I missing something, or is the only possible solution to use a Packet Filter extension (a second one, so as not to interfere with the Socket Filter one) and to filter based on packet content, which only works for Wi-Fi/Ethernet interfaces?
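
To make "filtering based on packet content" concrete, here is a minimal C sketch of what such a check involves: pulling the first question name out of a DNS query payload and comparing it against an allowed domain. It is not taken from any real product. The function names dns_first_qname and domain_matches are hypothetical, the sketch assumes the UDP payload has already been located inside the packet, and it rejects compressed names rather than following pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <strings.h>

/* Copies the first question name of a DNS message into `out` as a dotted
 * string. `msg` is the UDP payload, starting at the 12-byte DNS header.
 * Returns 0 on success, -1 on malformed or unsupported input. */
int dns_first_qname(const uint8_t *msg, size_t len, char *out, size_t outlen)
{
    assert(msg != NULL && out != NULL);
    if (len < 12 || outlen == 0)
        return -1;
    if (((msg[4] << 8) | msg[5]) == 0)   /* QDCOUNT: no question present */
        return -1;

    size_t pos = 12, o = 0;
    for (;;) {
        if (pos >= len)
            return -1;                   /* ran off the end of the payload */
        uint8_t label_len = msg[pos++];
        if (label_len == 0)
            break;                       /* root label terminates the name */
        if (label_len > 63)
            return -1;                   /* compression pointer or reserved */
        if (pos + label_len > len)
            return -1;
        size_t need = label_len + (o ? 1 : 0);
        if (o + need >= outlen)
            return -1;                   /* keep room for the trailing NUL */
        if (o)
            out[o++] = '.';
        memcpy(out + o, msg + pos, label_len);
        o += label_len;
        pos += label_len;
    }
    out[o] = '\0';
    return 0;
}

/* Returns 1 if `name` equals `domain` or is a subdomain of it
 * (case-insensitive), 0 otherwise. */
int domain_matches(const char *name, const char *domain)
{
    size_t n = strlen(name), d = strlen(domain);
    if (d == 0 || n < d)
        return 0;
    if (strcasecmp(name + (n - d), domain) != 0)
        return 0;
    return n == d || name[n - d - 1] == '.';
}
```

A filter built this way sees only names, not the process that asked for them, which is exactly the limitation described in point 6.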

Replies

"the NEFilterPacketProvider does not seem to have a way how to attribute the packet to a process."

Correct. I would not count on this data being available, and it would be a performance hit to look it up on every packet if it were.

Thank you for the detailed response and for including the information based on the research that you have done thus far. It sounds like your main high-level goal is to make sure that you are performing content filter actions on DNS traffic. For this, I agree that you should start with something basic and not include many Network System Extensions if you do not have to. One question to start things off:

You mentioned:

"loading socket filter drops all active connections. This is understandable for firewall type API, but it is an extremely inconvenient behavior if we do not want to actually implement a firewall. There is currently no option to configure this behavior of socket filter."

Assuming that you are referring to NEFilterDataProvider here, yes, it will drop active connections up front when the provider starts, but if the connections are retried then they should work. This should also allow you to target UDP port 53 on the system to make filtering decisions about UDP traffic based on an app. Using this provider alone, can you talk more about what is happening here:

"If we block one flow, the socket can no longer be used to communicate with the specific remote IP and fails with EPIPE"

Please include any rules or code examples as well.

Concerning "If we block one flow, the socket can no longer be used to communicate with the specific remote IP and fails with EPIPE":

Socket Filter generates a new NEFilterFlow for every "pseudo connection"; it seems this means that you get one flow for every "local IP : local port : remote IP : remote port" tuple. In other words, every UDP BSD socket generates as many flows as the number of "remote IP : remote port" targets it actually contacts.
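
The tuple relationship described above can be modelled in a few lines of C. This is only a sketch of the observed behaviour, not of how NetworkExtension is implemented: the struct udp_flow_key and the function count_flows are hypothetical names, and the model covers IPv4 only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the 4-tuple that identifies one UDP flow. */
struct udp_flow_key {
    uint32_t local_ip;     /* IPv4 address, host byte order */
    uint16_t local_port;
    uint32_t remote_ip;
    uint16_t remote_port;
};

int flow_key_equal(const struct udp_flow_key *a, const struct udp_flow_key *b)
{
    return a->local_ip == b->local_ip && a->local_port == b->local_port &&
           a->remote_ip == b->remote_ip && a->remote_port == b->remote_port;
}

/* Counts the distinct 4-tuples among `n` sent datagrams, i.e. the number of
 * flows the filter would be handed in this model. Quadratic, which is fine
 * for a sketch. */
size_t count_flows(const struct udp_flow_key *sent, size_t n)
{
    assert(sent != NULL || n == 0);
    size_t flows = 0;
    for (size_t i = 0; i < n; i++) {
        size_t j = 0;
        while (j < i && !flow_key_equal(&sent[i], &sent[j]))
            j++;
        if (j == i)        /* no earlier datagram used the same tuple */
            flows++;
    }
    return flows;
}
```

In this model, one socket that sends two queries to one resolver and one query to another yields two flows, and a verdict on one of them says nothing about the other.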

Every such flow can be left in an undecided state while you inspect the traffic, and so on. But the moment we .drop it, communication with that target becomes impossible from that socket from then on (sendmsg() to it fails with an error). Please note that traffic to other targets still works fine.

This behaviour is sufficiently weird that most apps cannot cope with it. I am pretty sure it also breaks the mDNSResponder process if you do this to it: rather than recreating the socket on such an error, it simply ignores the error. This may lead to weird states in which half of DNS works, because one DNS server can still be contacted from the socket, and half does not, because the other server was rendered unusable. I have not double-checked this mDNSResponder behaviour, so please note that I may be wrong (I analysed it quite a long time ago). But my test app behaves like this. Even fixing mDNSResponder would not fix the issue, as apps can use their own DNS messaging (dig, libresolv, other third-party code), and they will not be ready for this. In general, people do not really seem to handle unexpected errors from UDP sendmsg().
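
For reference, this is roughly what a client would have to do to survive such an error. It is a minimal C sketch with hypothetical function names, assuming that EPIPE is the only error that signals an unusable socket (as reported in this thread) and that a freshly created socket is given a new flow and therefore a new verdict, which depends entirely on the filter.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption from this thread: EPIPE on a UDP send means the socket can no
 * longer reach that remote endpoint and has to be replaced. */
int udp_send_error_needs_new_socket(int err)
{
    return err == EPIPE;
}

/* Sends one datagram. If the send fails in a way that suggests the socket
 * is unusable for this destination, replaces the socket once and retries.
 * Note that the new socket gets a new local port, so replies to queries
 * sent earlier from the old socket are lost. */
ssize_t sendto_with_recreate(int *fd, const void *buf, size_t len,
                             const struct sockaddr *dst, socklen_t dstlen)
{
    assert(fd != NULL && dst != NULL);
    ssize_t n = sendto(*fd, buf, len, 0, dst, dstlen);
    if (n >= 0 || !udp_send_error_needs_new_socket(errno))
        return n;

    int fresh = socket(dst->sa_family, SOCK_DGRAM, 0);
    if (fresh < 0)
        return -1;                 /* keep the old socket, report failure */
    close(*fd);
    *fd = fresh;
    return sendto(*fd, buf, len, 0, dst, dstlen);
}
```

Every DNS client on the system would need equivalent logic, which is the practical problem with relying on it.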

IMO for selective filtering of UDP, Socket Filter is not a feasible approach.

Okay, thank you for explaining that point more in-depth. I'm wondering if something might be wrong with your setup in NEFilterDataProvider. As a quick test, I set up a general outbound UDP filtering rule with a default action of NEFilterActionFilterData:

// Remote endpoint for the rule: 0.0.0.0 with a prefix length of 0 (set below), port 53.
NWHostEndpoint *udpHostEndpoint = [NWHostEndpoint endpointWithHostname:@"0.0.0.0" port:@"53"];
NENetworkRule *anyHostAndPortUDPRule = [[NENetworkRule alloc] initWithRemoteNetwork:udpHostEndpoint
                                                                     remotePrefix:0
                                                                     localNetwork:nil
                                                                      localPrefix:0
                                                                         protocol:NENetworkRuleProtocolUDP
                                                                        direction:NETrafficDirectionAny];

// Hand matching flows to the data provider, and do the same for everything else by default.
NEFilterRule *udpFilterRule = [[NEFilterRule alloc] initWithNetworkRule:anyHostAndPortUDPRule action:NEFilterActionFilterData];
NEFilterSettings *filterSettings = [[NEFilterSettings alloc] initWithRules:@[udpFilterRule] defaultAction:NEFilterActionFilterData];

Next, in handleNewFlow, I dropped all UDP port 53 (DNS) traffic that came through for the first 15 flows. That means traffic from com.apple.mail and com.apple.mDNSResponder for port 53 was just dropped on the floor.

After the first 15 flows were dropped, I switched to returning a filter verdict from filterDataVerdictWithFilterInbound, so subsequent traffic was evaluated and then resolved rather than dropped on the floor. As a result, com.apple.mail and com.apple.mDNSResponder traffic started working again:

2022-06-17 09:12:30.936657-0700 0x23772e   Default     0x0                  62146  0    com.example.apple-samplecode.FilterTestBed.FilterDataProvider: [com.example.apple-samplecode.FilterTestBed.FilterDataProvider:data_provider] handleOutboundDataFromFlow packet data:  m a i l \^A g \^E a p p l e \^C c o m

2022-06-17 09:12:30.956581-0700 0x23772e   Default     0x0                  62146  0    com.example.apple-samplecode.FilterTestBed.FilterDataProvider: [com.example.apple-samplecode.FilterTestBed.FilterDataProvider:data_provider] handleInboundDataCompleteForFlow
        identifier = D89B5B5D-793C-4940-C09D-6B0143132C00
        sourceAppIdentifier = .com.apple.mail

Double check your filtering rules.