      Latest reply on Dec 9, 2019 5:14 AM by eskimo
      kennyc Level 1 (0 points)

        Given an XPC process, what is the most efficient way to get data back to the host application?

         

        I have an XPC process, primarily written in C++ and a host application written in Swift. It generates a bunch of data that it serializes into an std::vector<std::byte>. When the process is finished, I want to efficiently transfer that buffer of bytes back to the host application.

         

        At the moment, I copy the std::vector data() into an NSData, then encode that NSData into an object conforming to NSSecureCoding that is then sent back to the app. At a minimum this is creating two copies of the data (one in the vector and the other in the NSData) but then I suspect that the XPC transporter might be creating another?

         

        * When using NSData, can I use the bytesNoCopy version if I guarantee that the underlying vector is still alive when I initiate the XPC connection response? When that call returns, am I then free to deallocate the vector even if the NSData is still in-flight back to the main app?

         

        * In one of the WWDC videos, it is recommended to use DispatchData as DispatchData might avoid making a copy when being transported across XPC. Does this apply when using NSXPCConnection or only when using the lower-level C APIs?

         

        * Is there a downside to using DispatchData that might increase the overhead?

         

        * Finally, where does Swift's Data type fit into this? On the application side, I have Swift code that is reading the buffer as a stream of bytes, so I ideally want the buffer to be contiguous and in a format that doesn't require another copy in order for Swift to be able to read it.

         

        (On average, the buffers tend to be small. Maybe only 1-2 megabytes, if that. But occasionally a buffer might balloon to 100-200 megabytes.)

        • Re: Efficiently sending data from an XPC process to the host application.
          john daniel Level 4 (520 points)

          It is 2019. Don't worry about "MBs".

          • Re: Efficiently sending data from an XPC process to the host application.
            eskimo Apple Staff (12,675 points)

            This response is kinda long.  Normally I’m not able to dedicate this much time on a DevForums post, but in this case I did some research over the weekend because I was curious how this stuff worked.  I hope to be able to answer basic follow-up questions here, but if we go too far down the rabbit hole I may ask you to open a DTS tech support incident so that I can allocate more time to look at this.


            OK, with that out of the way, let’s tackle your immediate questions:

            When using NSData, can I use the bytesNoCopy version if I guarantee that the underlying vector is still alive when I initiate the XPC connection response? When that call returns, am I then free to deallocate the vector even if the NSData is still in-flight back to the main app?

            No.  There are three different ‘no copy’ methods, none of which support those semantics:

            • -initWithBytesNoCopy:length: will call free on the buffer when NSData is deallocated (that is, its reference count hits zero).

            • -initWithBytesNoCopy:length:deallocator: acts like the above, but calls the custom deallocator you provide.

            • -initWithBytesNoCopy:length:freeWhenDone: acts like the above, but it only calls free if the freeWhenDone parameter is true.

            In all cases, the buffer will (or must) stick around until the reference count hits zero.
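            To illustrate the ownership rule above, here is a minimal sketch in Swift using the deallocator variant. The buffer size and fill byte are made up for the example, and a plain heap allocation stands in for the std::vector storage from the question.

```swift
import Foundation

// Hypothetical buffer standing in for the serialized output.
let length = 1024
let buffer = UnsafeMutableRawPointer.allocate(byteCount: length, alignment: 16)
buffer.initializeMemory(as: UInt8.self, repeating: 0x5A, count: length)

// The NSData adopts `buffer` without copying it. The deallocator is the only
// place the memory is released, and it runs when the NSData itself is
// deallocated, not when any particular send call returns.
let data = NSData(bytesNoCopy: buffer, length: length, deallocator: { pointer, _ in
    pointer.deallocate()
})

// The data object points at the original allocation rather than a copy.
let sharesStorage = (data.bytes == UnsafeRawPointer(buffer))
let firstByte = data.bytes.load(as: UInt8.self)
print("shares storage: \(sharesStorage), first byte: \(firstByte)")
```

            For a std::vector, the equivalent would presumably be a deallocator that owns the vector (for example, one that deletes a heap-allocated vector), so that the storage lives exactly as long as the NSData does.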

            In one of the WWDC videos, it is recommended to use DispatchData as DispatchData might avoid making a copy when being transported across XPC. Does this apply when using NSXPCConnection … ?

            Yes, but the full story is more complex than that.  I’ll go into this more below.

            Is there a downside to using DispatchData that might increase the overhead?

            What sort of overhead are you concerned about?

            Finally, where does Swift's Data type fit into this?

            The key thing to remember about Data is that, as of Swift 5, it has a different underlying model than NSData.  The C dispatch_data_t type is bridged to NSData [1], and thus DispatchData and NSData share the same underlying model, that is, the data is composed of multiple runs of contiguous bytes.  That’s why you have dispatch_data_apply and -[NSData enumerateByteRangesUsingBlock:], both of which let you get at these underlying runs.

            Swift’s Data is simpler: It supports a single contiguous run of bytes [2].  Thus, you have to be very careful when bridging between NSData and Swift.  In my experience, doing a round trip through Data (that is, NSData to Data and back to NSData) without actually touching the data is fine.  But as soon as you try to touch the bytes you run the risk of Data ‘flattening’ the bridged NSData.
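            The difference between the two models can be sketched as follows. This is only an illustration with made-up byte values: it builds a DispatchData from two separately created pieces, walks the underlying runs, and then converts to Data, which copies everything into a single contiguous buffer.

```swift
import Foundation
import Dispatch

// Two hypothetical chunks of bytes.
let first: [UInt8] = [1, 2, 3]
let second: [UInt8] = [4, 5]

// Concatenating dispatch data objects does not have to merge their storage,
// so the result may be made up of more than one run of contiguous bytes.
var combined = first.withUnsafeBytes { DispatchData(bytes: $0) }
let tail = second.withUnsafeBytes { DispatchData(bytes: $0) }
combined.append(tail)

// Visit each underlying run, recording how long it is.
var runLengths: [Int] = []
combined.enumerateBytes { run, _, _ in
    runLengths.append(run.count)
}
print("run lengths: \(runLengths)")

// Building a Data from the bytes copies them into one contiguous run.
let flat = Data(combined)
print("flat count: \(flat.count)")
```

            The explicit Data(combined) conversion here is a deliberate copy. The risk described above is that a similar flattening can happen implicitly when you touch the bytes of a bridged NSData.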


            Before we get further into the data case, a quick note about NSXPCConnection and the low-level <xpc/xpc.h> API.  These are closely related and, for the most part, the high-level API supports everything supported by the low-level API.  There is, however, one critical caveat: The low-level API lets you transport more types over the connection.

            For example, imagine you want to create an XPC Service where one part of your code uses NSXPCConnection and another part uses the low-level API.  Historically this was tricky because a) an XPC Service can only register a single service, and b) there’s no way to transport an xpc_endpoint_t over NSXPCConnection and there’s no way to transport an NSXPCListenerEndpoint over xpc_connection_t.

            This has been resolved in 10.15.  There we added -[NSXPCInterface setXPCType:forSelector:argumentIndex:ofReply:], which allows you to transport arbitrary XPC objects over NSXPCConnection.  Yay!

            Note The availability macros on that method indicate that it’s available since 10.14.  That’s not my experience, and I’ve filed a bug to get that corrected (r. 57736296).

            Historically this used to crop up when folks were trying to transport an IOSurface.  That API has an IOSurfaceCreateXPCObject routine that returns a low-level XPC object that represents the surface, but you couldn’t transport that over an NSXPCConnection.  However, that specific problem got resolved on 10.12 where we introduced a new Objective-C IOSurface object, and that object is transportable directly over NSXPCConnection.  So double yay!


            Coming back to the problem of transporting data efficiently, there are two paths to consider:

            • You can share data implicitly by sending a data object.

            • You can share data explicitly using a shared memory object.

            I’ll discuss each in turn below, albeit in reverse (-:


            Both NSXPCConnection and the low-level XPC API let you explicitly transport shared memory objects over the connection.  The low-level API supports this explicitly via a shared memory object (XPC_TYPE_SHMEM), created using xpc_shmem_create.  In contrast, for NSXPCConnection you must create a POSIX shared memory object (using shm_open; see its man page), wrap the resulting file descriptor in an NSFileHandle, and then pass that over the XPC connection.

            Note You can actually use the latter technique with the low-level API as well, using an XPC_TYPE_FD object created using xpc_fd_create.  I can’t see any advantage of doing that, but there’s probably some subtlety I’ve missed.
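            As a rough, macOS-only sketch of the low-level route, the following boxes an anonymous shared mapping with xpc_shmem_create and maps it back with xpc_shmem_map. To keep it self-contained, both halves run in one process; in a real service the XPC object would be placed in a message and mapped by the peer. The region size and the test byte are arbitrary.

```swift
import Foundation
import XPC

// Hypothetical region size; a multiple of the page size.
let regionLength = 1 << 16

// xpc_shmem_create expects memory obtained from mmap with MAP_SHARED.
let mapping = mmap(nil, regionLength, PROT_READ | PROT_WRITE, MAP_ANON | MAP_SHARED, -1, 0)
guard let region = mapping, region != MAP_FAILED else { fatalError("mmap failed") }
region.storeBytes(of: 0x2A, as: UInt8.self)

// Box the region in an XPC object that could be sent over a connection.
let shmem = xpc_shmem_create(region, regionLength)

// What the receiving side would do: map the boxed region into its own
// address space. A return value of 0 indicates failure.
var mapped: UnsafeMutableRawPointer? = nil
let mappedLength = xpc_shmem_map(shmem, &mapped)
if let mapped = mapped, mappedLength > 0 {
    print("mapped \(mappedLength) bytes, first byte: \(mapped.load(as: UInt8.self))")
    munmap(mapped, mappedLength)
}
munmap(region, regionLength)
```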

            Overall, I can’t help but think that this might be the best option for you.  That is, set up a pool of shared memory regions and then just include the region ID in the XPC message.  It’s hard to imagine any other approach having a lower overhead.

            Of course shared memory raises both security and correctness issues.  Given that this is an app-specific XPC Service, I don’t think security is a big concern.  However, correctness is always a challenge.  Specifically, you have to prevent the XPC Service from modifying the buffer while the client is still using it.


            On the implicitly shared front, XPC leans heavily on Mach messages, which have support for both inline memory passing for small buffers and out-of-line memory passing for large buffers.  The exact cutoff is not documented, but I believe it’s just under 16 KiB, which is way smaller than the buffers you’re using (for those reading along at home, this thread originated on Swift Forums, which has more info on kennyc’s requirements).

            Passing data out-of-line is more or less automatic.  However, you’ll want to make sure that the data is in its own memory region.  You can do this in a variety of ways.  The option I got working was as follows:

            1. Call mmap to allocate the memory.

            2. Wrap that in a DispatchData using init(bytesNoCopy:deallocator:), passing .unmap to the deallocator parameter.

              Note This pattern is based on the comments in the xpc_objects man page.  My digging suggests that this isn’t an absolute requirement — there are other ways to get memory that’s guaranteed to be in its own region — but it’s what I tried first and it worked.

            3. Cast that to an NSData.

              Note I couldn’t find a way to do that in Swift, so I bounced over to Objective-C to do it.

            4. Coerce that to a Data.

            5. Send that via NSXPCConnection.

            6. On the receive side, coerce the Data to an NSData.

            7. Work with the data.
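            Steps 1 and 2 can be sketched for the case in the original question, where the bytes are generated by the service rather than read from a file. This is a minimal sketch under assumptions: it uses an anonymous mapping, a made-up 1 MiB payload, and a fill byte standing in for the serialized output. The Objective-C hop in step 3 and the XPC send are not shown.

```swift
import Foundation
import Dispatch

// Hypothetical payload size; a multiple of the page size.
let payloadLength = 1 << 20

// Step 1: allocate memory that lives in its own region.
let mapping = mmap(nil, payloadLength, PROT_READ | PROT_WRITE, MAP_ANON | MAP_PRIVATE, -1, 0)
guard let base = mapping, base != MAP_FAILED else { fatalError("mmap failed") }

// Stand-in for serializing directly into the mapped region.
base.initializeMemory(as: UInt8.self, repeating: 0xAB, count: payloadLength)

// Step 2: wrap the region without copying. The .unmap deallocator calls
// munmap on the region once the last reference to the data goes away.
let region = UnsafeRawBufferPointer(start: base, count: payloadLength)
let payload = DispatchData(bytesNoCopy: region, deallocator: .unmap)
print("payload count: \(payload.count)")
```

            Serializing straight into the mapped region, rather than into a std::vector first, is what would avoid the extra copy on the sending side.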

            I tested this using the code at the end of this response.  Specifically, I created a new command-line tool target, added this code to it, and had the main entry point instantiate XPCDataTest and call run.  This was using Xcode 11.2 on macOS 10.14.6, but I think it’ll work the same on 10.12 or later.

            There are a couple of things to note about this code:

            • It uses an anonymous listener so that I can do all the work in a single process.  This is a really useful technique to remember when debugging and testing XPC.

            • The data I send over is a memory region that represents a memory mapped file.  I did this because it makes it easy to confirm that the data went across without being copied.  Specifically, if you set a breakpoint on the line that logs done, you’ll see output like this:

              XPCDataTest[66317:6721193] base: 0x0000000106000000
              XPCDataTest[66317:6721639] …
              XPCDataTest[66317:6721639]… bytes: 0x0000000106f22000

              base is the address of the buffer on the send side and bytes is the address on the receive side.  You can then run vmmap against that process:

              $ vmmap -interleaved 66317
              …
              mapped file 0000000106000000-0000000106f22000 … /System/Library/Kernels/kernel
              mapped file 0000000106f22000-0000000107e43000 … /System/Library/Kernels/kernel
              ……

              Note how both addresses reference the same memory mapped file, and thus the memory went across without a copy.

            Share and Enjoy

            Quinn “The Eskimo!”
            Apple Developer Relations, Developer Technical Support, Core OS/Hardware
            let myEmail = "eskimo" + "1" + "@apple.com"

            [1] That is, you can treat any dispatch_data_t as an NSData (on modern systems).  There is no bridging in the other direction.

            [2] This is as of Swift 5.  Earlier incarnations of Swift tried to support the NSData model.


            @objc
            protocol XPCTest {
                func sendData(_ data: Data)
            }
            
            class XPCDataTest: NSObject, NSXPCListenerDelegate, XPCTest {
            
                let listener = NSXPCListener.anonymous()
            
                var connection: NSXPCConnection? = nil
            
                func run() {
            
                    // Set up the listener.
            
                    self.listener.delegate = self
                    listener.resume()
            
                    // Create and set up a client connection to that listener.
            
                    let connection = NSXPCConnection(listenerEndpoint: self.listener.endpoint)
                    self.connection = connection
            
                    connection.interruptionHandler = { NSLog("interruption") }
                    connection.invalidationHandler = { NSLog("invalidation") }
            
                    connection.resume()
            
                    connection.remoteObjectInterface = NSXPCInterface(with: XPCTest.self)
                    let p = connection.remoteObjectProxy as! XPCTest
            
                    // Send a data value over that connection.  I'm sending a region
                    // occupied by a memory mapped file, which isn’t what you’d do normally.
                    // I did this here because it allows me to confirm the transfer, as I’ve
                    // explained in the text above.
            
                    let fd = open("/System/Library/Kernels/kernel", O_RDONLY)
                    assert(fd >= 0)
            
                    // 15867112 is the length of the file on my machine; obviously this is a huge hack (-:
                    let base = mmap(nil, 15867112, PROT_READ, MAP_FILE | MAP_PRIVATE, fd, 0)!
                    assert(base != MAP_FAILED)
                    NSLog("base: %@", "\(base)")
            
                    let success = close(fd) >= 0
                    assert(success)
            
                    let buffer = UnsafeRawBufferPointer(start: base, count: 15867112)
                    let d = DispatchData(bytesNoCopy: buffer, deallocator: DispatchData.Deallocator.unmap)
                    // `Hack` is an Objective-C class that has a `+dataForDispatchData:` method.
                    let dd = Hack.data(forDispatchData: d as __DispatchData)
            
                    p.sendData(dd)
            
                    dispatchMain()
                }
            
                var listenerConnection: NSXPCConnection? = nil
            
                func listener(_ listener: NSXPCListener, shouldAcceptNewConnection newConnection: NSXPCConnection) -> Bool {
                    // Reject anything except the first connection.
                    guard self.listenerConnection == nil else { return false }
                    // Set this up as a ‘server-side’ connection.
                    self.listenerConnection = newConnection
                    newConnection.exportedInterface = NSXPCInterface(with: XPCTest.self)
                    newConnection.exportedObject = self
                    newConnection.resume()
                    return true
                }
            
                @objc
                func sendData(_ data: Data) {
                    NSLog("send data, count: %zd", data.count)
                    let nsd = data as NSData
                    NSLog("bytes: %@", "\(nsd.bytes)")
                    NSLog("done")
                }
            }
              • Re: Efficiently sending data from an XPC process to the host application.
                eskimo Apple Staff (12,675 points)

                Oh, one other thing.  When talking about explicit sharing I wrote:

                Specifically, you have to prevent the XPC Service from modifying the buffer while the client is still using it.

                This isn’t a problem with the implicit approach because the client gets its own (virtual) copy of the data, and the XPC Service frees its virtual copy of the data as soon as it’s done with it.  Oh, and the client’s copy is copy-on-write, so there’s no danger of client changes being seen by the XPC Service.

                With regard to the last point, if you’re explicitly sharing the data you can prevent the client modifying the data by remapping it read-only and then sending that read-only mapping to the client.

                That does not help with the synchronisation problem though.
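                One way to picture the read-only mapping is the sketch below. It is an illustration under assumptions, not the exact mechanism described above: an unlinked temporary file stands in for the shared memory object, both views live in one process, and the path prefix is hypothetical. The service keeps a writable view, while the descriptor that would be handed to the client only permits read-only shared mappings.

```swift
import Foundation

let length = Int(getpagesize())

// Hypothetical stand-in for a shared memory object: a temporary file.
let path = NSTemporaryDirectory() + "xpc-region-" + ProcessInfo.processInfo.globallyUniqueString
let writableFD = open(path, O_RDWR | O_CREAT | O_EXCL, 0o600)
precondition(writableFD >= 0, "could not create backing file")
precondition(ftruncate(writableFD, off_t(length)) == 0, "ftruncate failed")

// A second, read-only descriptor: this is what the client would receive.
let readOnlyFD = open(path, O_RDONLY)
precondition(readOnlyFD >= 0, "could not reopen backing file")
_ = unlink(path)

let serviceMapping = mmap(nil, length, PROT_READ | PROT_WRITE, MAP_SHARED, writableFD, 0)
let clientMapping = mmap(nil, length, PROT_READ, MAP_SHARED, readOnlyFD, 0)
guard let serviceView = serviceMapping, serviceView != MAP_FAILED,
      let clientView = clientMapping, clientView != MAP_FAILED else {
    fatalError("mmap failed")
}

// Writes made through the service's view are visible through the client's view.
serviceView.storeBytes(of: UInt8(42), as: UInt8.self)
let seenByClient = clientView.load(as: UInt8.self)

// A writable shared mapping cannot be created from the read-only descriptor.
let upgradeAttempt = mmap(nil, length, PROT_READ | PROT_WRITE, MAP_SHARED, readOnlyFD, 0)
let writableMappingRefused = (upgradeAttempt == MAP_FAILED)
print("client sees \(seenByClient), writable mapping refused: \(writableMappingRefused)")

munmap(serviceView, length)
munmap(clientView, length)
close(writableFD)
close(readOnlyFD)
```

                Note that the service can still change the bytes while the client is reading them, which is the synchronisation problem mentioned above.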
