Crash when using String(cString:)

Hi, I am currently dealing with a crash when converting UnsafePointer<CChar>!to String in Swift. The pointer comes from the type es_string_token_t which my app receives from the Endpoint Security framework.

This is what my code looks like:

extension es_string_token_t {
  var description: String {
    if self.data != nil && self.length > 0 {
      return String(cString: self.data)
    }
    return ""
  }
}

And it produces the following crash:

Thread 4 Crashed::  Dispatch queue: com.apple.root.default-qos
0   libsystem_platform.dylib      	       0x18bd44864 _platform_strlen + 4
1   libswiftCore.dylib            	       0x198f3a3c0 String.init(cString:) + 32
2   com.company.app.App  	       0x10456aac0 0x104564000 + 27328
3   com.company.app.App  	       0x10456f768 0x104564000 + 46952
4   com.company.app.App  	       0x1045793d8 0x104564000 + 87000
5   com.company.app.App  	       0x10457e8f8 0x104564000 + 108792
6   com.company.app.App  	       0x10458758c 0x104564000 + 144780
7   libdispatch.dylib             	       0x18bb6a5f0 _dispatch_call_block_and_release + 32
8   libdispatch.dylib             	       0x18bb6c1b4 _dispatch_client_callout + 20
9   libdispatch.dylib             	       0x18bb7da04 _dispatch_root_queue_drain + 680
10  libdispatch.dylib             	       0x18bb7e104 _dispatch_worker_thread2 + 164
11  libsystem_pthread.dylib       	       0x18bd2c324 _pthread_wqthread + 228
12  libsystem_pthread.dylib       	       0x18bd2b080 start_wqthread + 8

My app is deployed on arround 13k macs and only some of them experience this crash which I havent been able to reproduce. Any help would be appreciated.

Answered by santalvarez in 722984022

I found a solution similar to what Quinn suggested on Google's Project Santa's github. I just ported it to swift. Here it is:

let deadline = DispatchTime(uptimeNanoseconds: cMsg.pointee.deadline) - .seconds(2)

let handlerSemaphore = DispatchSemaphore(value: 0)
handlerSemaphore.signal()

let deadlineSemaphore = DispatchSemaphore(value: 0)

queue.asyncAfter(deadline: deadline) {
    // Check if event handler has already responded
    if handlerSemaphore.wait(timeout: .now()) == .timedOut {
        // Event handler already responded so exit
        return
    }

    // Deadline met, event handler still processing, default to allow
    self.allowMessage(cMsg)
    deadlineSemaphore.signal()
}

queue.async {
    // Process auth event and respond
    handleAuthEvent(cMsg)

    // Check if deadline was met
    if handlerSemaphore.wait(timeout: .now()) == .timedOut {
        // Wait for deadline block to allow the message, then free
        deadlineSemaphore.wait()
    }
    self.freeMessage(cMsg)
}

Your String init assumes that the data is encoded using UTF-8.
Could it be that in some cases, the cString is not encoded using UTF-8?
(You can specify the encoding to use, when creating your String.)

Crashing in a strlen-like function sounds like the string may not be null-terminated. That would be a bug (since the documentation for es_string_token_t explicitly says the string is null-terminated) but something to keep in mind. Assuming this remains impossible to repro locally, any chance you could (say) sanity-check or log the value of length? Or just call strlen() directly on the string pointer first and see if that crashes with the same frequency as the current String initializer.

Are you sure you’re managing the lifetime of the es_string_token_t correctly?

Most es_string_token_t values come from es_message_t values, and by default those are only valid for the duration of your handler. Consider this snippet:

let queue: DispatchQueue = … some queue …

let result = es_new_client(&client) { client, message in
    … message is valid here …
    queue.async {
        … but not valid here …
    }
}

If you accessed a string embedded in the message at the second point, the message might have been deallocated and you might see a crash like this.

The solution is to either do the work synchronously within your handler or copy the message like so:

let result = es_new_client(&client) { client, message in
    let message = es_copy_message(message)
    queue.async {
        … message is valid here …
        es_free_message(message)
    }
}

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Ok I found out what is happening but I need help fixing it. Look at this code

// Inside function that handles process exec messages
    guard let copiedMessage = copyMessage(msg) else {
         es_respond_auth_result(esClient.client!, msg, ES_AUTH_RESULT_ALLOW, false)
         return
    }

    DispatchQueue.global().async {
        var process: ProcessDetails? = nil
        let deadline = copiedMessage.pointee.deadline
        let semaphore = DispatchSemaphore(value: 0)

        DispatchQueue.global().async {
            // This may take some time
            process = ProcessDetails(process: copiedMessage.pointee.event.exec.target.pointee)
            semaphore.signal()
      }

      _ = semaphore.wait(timeout: DispatchTime(uptimeNanoseconds: deadline - (2 * NSEC_PER_SEC)))

      es_respond_auth_result(...)

      freeMessage(copiedMessage)
    }

I am calling description inside my ProcessDetails class, the problem is that if the timeout of the semaphore runs out I will free the copiedMessage but the initialization of ProcessDetails is still going so that results in a crash. How could I kill that seconds thread before freeing the message? Or should I somehow set another semaphore and wait for that thread to finish (even though I already responded to the message) to free the message?

So what are you actually building here? The fact that you fail insecure right at the top — that is, if copyMessage(…) fails you allow the operation — suggests that you’re not building an ES product, because those tend to fail securely. And there’s no indication of how you get the result from your inner nested blocks to es_respond_auth_result call, which is further evidence that you’re not actually authenticating.

Is the goal here simply to log these events? Or are you actually authenticating?

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

As per Quinns request here is a more detailed version of my code

static func handleProcessExec(_ msg: UnsafePointer<es_message_t>) {
    let target = msg.pointee.event.exec.target.pointee

    guard msg.pointee.process.pointee.is_es_client == false else {
      es_respond_auth_result(esClient.client!, msg, ES_AUTH_RESULT_ALLOW, true)
      return
    }

    // copy message so it can be used on another thread
    guard let copiedMessage = copyMessage(msg) else {
      es_respond_auth_result(esClient.client!, msg, ES_AUTH_RESULT_DENY, false)
      return
    }

    DispatchQueue.global().async {
        var process: ProcessDetails? = nil
        let deadline = DispatchTime(uptimeNanoseconds: copiedMessage.pointee.deadline - (2*NSEC_PER_SEC))
        let semaphore = DispatchSemaphore(value: 0)
        // RuleResult contains what rule matched and if we should allow/deny
        var result = RuleResult()
        let timestamp = copiedMessage.pointee.time.tv_sec

        DispatchQueue.global().async {
            process = ProcessDetails(process: copiedMessage.pointee.event.exec.target.pointee)
            ESEventRulesRunner.runProcessExecRules(process: process!, &result)

            semaphore.signal()
         }

      _ = semaphore.wait(timeout: deadline)

      es_respond_auth_result(esClient.client!, copiedMessage, result.verdict, false)

      if result.verdict == ES_AUTH_RESULT_DENY {
        // Log event to api
        let event = ESEvent(process: process!, timestamp: timestamp, ruleType: result.type)
        Api.postEvent(event: event)
       }

      freeMessage(copiedMessage)
    }
     
}

The code has serious conceptual problems. To start, don’t use the Dispatch global concurrent queue for this work. Using those queues is almost always a mistake. My Avoid Dispatch Global Concurrent Queues post has links to WWDC talks that explain why.

Second, using a semaphore like this is also almost always a mistake. Semaphores don’t support the concept of ownership and thus don’t benefit from the system’s priority inversion avoidance.

Beyond that, I’m not sure why you’re using a semaphore for this. You are always copying the message so you don’t need to respond with a result before returning from the handler. So why block your thread? If you want to implement a timeout mechanism, do it asynchronously using a timer.

As to your crasher, the solution here is to recognise that there’s no hard requirement that the message lifetime be matched to the authorisation decision. If both of your code paths — the one doing the work and the one doing the timeout — reference the message, use a ref count to make sure sure that you only call freeMessage(_:) when they’re both done.

Finally, I question the utility of this timeout in general. An ES client has to be able to hit its real time goal every time. If it does that, the timeout isn’t useful. And if it doesn’t do that, then you have bigger problems such that the timeout is unlikely to help.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Accepted Answer

I found a solution similar to what Quinn suggested on Google's Project Santa's github. I just ported it to swift. Here it is:

let deadline = DispatchTime(uptimeNanoseconds: cMsg.pointee.deadline) - .seconds(2)

let handlerSemaphore = DispatchSemaphore(value: 0)
handlerSemaphore.signal()

let deadlineSemaphore = DispatchSemaphore(value: 0)

queue.asyncAfter(deadline: deadline) {
    // Check if event handler has already responded
    if handlerSemaphore.wait(timeout: .now()) == .timedOut {
        // Event handler already responded so exit
        return
    }

    // Deadline met, event handler still processing, default to allow
    self.allowMessage(cMsg)
    deadlineSemaphore.signal()
}

queue.async {
    // Process auth event and respond
    handleAuthEvent(cMsg)

    // Check if deadline was met
    if handlerSemaphore.wait(timeout: .now()) == .timedOut {
        // Wait for deadline block to allow the message, then free
        deadlineSemaphore.wait()
    }
    self.freeMessage(cMsg)
}

I don't understand why you question the use of the timeout

Because you’re building an ES product, where you have to supply an answer promptly because there’s no good default answer:

  • You can’t fail insecurely because that’s… well… insecure.

  • But if you fail securely, random stuff on the Mac is gonna start failing mysteriously.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Crash when using String(cString:)
 
 
Q