Process arguments from audit token

Hi, how can I get the command-line arguments of a process given its audit token?

My app is a Content Filter Network Extension written in Swift. I can obtain the audit token from NEFilterFlow, and I was able to get the pid from the audit token using audit_token_to_pid, but I can't figure out how to get the process arguments.

Answered by DTS Engineer in 678375022

What do you intend to do with this information?

The reason I ask is that the system doesn’t persist a process’s arguments in any way that could be considered secure. Rather, when you run a child process the system copies the arguments you supply to the child process’s address space. This goes into a read/write area that the child process can modify. So, if you get the process’s arguments (using, say, ps ajxww [1]) the value you see is under the process’s control. This may be sufficient for your needs, but you must not make security decisions based on it.
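To make the point above concrete, here is a minimal sketch of a process overwriting its own `argv[0]` in place. The helper name `overwriteOwnArgv0` is hypothetical. It only shows that the arguments area is writable from inside the process; the expectation that tools reading that area would then report the overwritten bytes is an assumption, not something verified here.

```swift
import Foundation

// Hypothetical helper: overwrites every byte of this process's own argv[0]
// with `byte`, in place. Returns the number of bytes overwritten.
@discardableResult
func overwriteOwnArgv0(with byte: UInt8) -> Int {
    guard CommandLine.argc > 0, let argv0 = CommandLine.unsafeArgv[0] else {
        return 0
    }
    let length = strlen(argv0)
    // The terminating NUL is left alone, so the string keeps its length.
    memset(argv0, Int32(byte), length)
    return length
}

let before = String(cString: CommandLine.unsafeArgv[0]!)
overwriteOwnArgv0(with: UInt8(ascii: "x"))
let after = String(cString: CommandLine.unsafeArgv[0]!)
print("before: \(before)")
print("after:  \(after)")
```

Nothing privileged is needed for this, which is why the value cannot be the basis for a security decision.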

Oh, btw, I’m assuming you’re targeting the Mac here.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

[1] Although you shouldn’t run ps from your NE provider! There are better ways, which I’ll go into once we’ve covered the ground rules.

I’m not planning on making security decisions based on this info

Good to hear!

My go-to API for this sort of thing is <libproc.h> and I just assumed that it would have support for getting this info. Alas, it does not. That’s most frustrating.

That means you have to fall back to the way that ps works, namely using the KERN_PROCARGS (well, KERN_PROCARGS2) sysctl to copy the arguments area out of the process and then grovel through that. This is not fun. You can look at the source code for ps for the details (search for KERN_PROCARGS2 in this file).

Honestly, running ps seems like not such a bad option at this point )-:

Regardless of what you do here, I encourage you to file an enhancement request against <libproc.h> for it to support this directly. Please post your bug number, just for the record.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

I submitted an enhancement request as you suggested, FB9149624.

Thanks!

I am not sure I selected the correct category

I just took a look and it’s landed in roughly the right place (close enough that it’ll definitely find its way to the right place :–).

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Ok so here is what I came up with. This works fine, the only thing I have yet to see is if it has any memory leaks. I hope this helps someone in the future 😃 If you have any suggestions to the func please let me know.

Also, you should first convert the audit_token to pid using the function mentioned above.

func getArgs(from pid: Int32) -> [NSString]? {

  var arguments: [NSString] = []

  var mib: [Int32] = [0, 0, 0]
  var argsMax: Int = 0

  mib[0] = CTL_KERN
  mib[1] = KERN_ARGMAX

  var size = MemoryLayout<Int>.stride(ofValue: argsMax)
  if sysctl(&mib, 2, &argsMax, &size, nil, 0) == -1 {
    return nil
  }

  let processArgs = UnsafeMutablePointer<CChar>.allocate(capacity: argsMax)

  mib[0] = CTL_KERN
  mib[1] = KERN_PROCARGS2
  mib[2] = pid

  size = argsMax as size_t

  // Get process arguments
  if sysctl(&mib, 3, processArgs, &size, nil, 0) == -1 {
    return nil
  }

  if size <= MemoryLayout<Int>.size {
    return nil
  }

  var numberOfArgs: Int32 = 0

  // Get number of args
  memcpy(&numberOfArgs, processArgs, MemoryLayout.size(ofValue: numberOfArgs))

  // Initialize the pointer to the start of args
  var parser: UnsafeMutablePointer<CChar> = processArgs + MemoryLayout.size(ofValue: numberOfArgs)

  // Iterate until NULL terminated path
  while parser < &processArgs[size] {
    if 0x0 == parser.pointee {
      // arrived at argv[0]
      break
    }
    parser += 1
  }

  // sanity check
  if parser == &processArgs[size] {
    return nil
  }

  while parser < &processArgs[size] {
    if 0x0 != parser.pointee {
      break
    }
    parser += 1
  }

  // sanity check
  if parser == &processArgs[size] {
    return nil
  }

  var argStart: UnsafeMutablePointer<CChar>? = parser
  // Get all args
  while parser < &processArgs[size] {

    if parser.pointee == CChar(0) {

      if nil != argStart {
        let argument = NSString(utf8String: argStart!)

        if argument != nil {
          arguments.append(argument!)
        }
      }

      argStart = parser + 1

      if arguments.count == numberOfArgs {
        break
      }
    }
    parser += 1
  }

  // Is this free necessary?
  free(processArgs)
  return arguments
}

Accepted Answer

This works fine, the only thing I have yet to see is if it has any memory leaks.

Your call to free is both necessary and sufficient, but I found it a bit weird because I usually pair allocators. That is, if I allocate with malloc I free with free but if I allocate with UnsafeMutablePointer I then free with its deallocate() method.
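As a sketch of the allocator pairing described above: memory obtained from `UnsafeMutablePointer.allocate(capacity:)` is released with `deallocate()`, and putting that in a `defer` covers every exit path. This matters for `getArgs`, where the early `return nil` statements skip the release entirely. The helper name `withScratchBuffer` is hypothetical, not an existing API.

```swift
import Foundation

// Hypothetical helper: runs `body` with a zero-filled scratch buffer
// allocated via `UnsafeMutablePointer.allocate(capacity:)` and releases it
// with the matching `deallocate()` on every exit path, including throws.
func withScratchBuffer<R>(
    capacity: Int,
    _ body: (UnsafeMutableBufferPointer<CChar>) throws -> R
) rethrows -> R {
    let base = UnsafeMutablePointer<CChar>.allocate(capacity: capacity)
    base.initialize(repeating: 0, count: capacity)
    defer {
        base.deinitialize(count: capacity)
        base.deallocate()
    }
    return try body(UnsafeMutableBufferPointer(start: base, count: capacity))
}

// Usage: an early `return nil` inside the closure no longer leaks.
let greeting: String? = withScratchBuffer(capacity: 16) { buf -> String? in
    let bytes = Array("hello".utf8)
    guard bytes.count < buf.count else { return nil }  // early exit, still freed
    for (i, b) in bytes.enumerated() {
        buf[i] = CChar(truncatingIfNeeded: b)
    }
    // The buffer was zero-filled, so the copied bytes are NUL-terminated.
    return String(cString: buf.baseAddress!)
}
print(greeting ?? "nil")
```

Scoping the pointer to a closure also keeps it from escaping past the point where it is freed.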

If you have any suggestions to the func please let me know.

At the ‘API’ level, it’s weird that you return NSString rather than String.

As far as the implementation is concerned, it’s full of unsafe pointer manipulation that’s… well… very unsafe. When I’m parsing untrusted data — and remember that this is untrusted, in that the remote process can modify the data in any way — I prefer to build a parser that’s more paranoid.

To start, I’d split the code for getting the process’s argument memory block off from the code that parses it. This has a number of benefits:

  • It lets me test the parsing code in isolation.

  • If the technique for getting the memory block changes, I can adapt to that without changing the parser.

  • It isolates the unsafe code.

So, here’s my code for getting the memory block:

import Foundation
import System

func argumentData(for pid: pid_t) throws -> Data {
    // There should be a better way to get a process’s arguments
    // (FB9149624) but right now you have to use `KERN_PROCARGS2`
    // and then parse the results.
    var argMax: CInt = 0
    var argMaxSize = size_t(MemoryLayout.size(ofValue: argMax))
    let err = sysctlbyname("kern.argmax", &argMax, &argMaxSize, nil, 0)
    guard err >= 0 else {
        throw System.Errno(rawValue: errno)
    }
    precondition(argMaxSize != 0)
    var result = Data(count: Int(argMax))
    let resultSize = try result.withUnsafeMutableBytes { buf -> Int in
        var mib: [CInt] = [
            CTL_KERN,
            KERN_PROCARGS2,
            pid
        ]
        var bufSize = buf.count
        let err = sysctl(&mib, CUnsignedInt(mib.count), buf.baseAddress!, &bufSize, nil, 0)
        guard err >= 0 else {
            throw System.Errno(rawValue: errno)
        }
        return bufSize
    }
    result = result.prefix(resultSize)
    return result
}

Note that the only unsafe code here is the code that has to be unsafe because I’m calling sysctlbyname and sysctl.

And here’s my code for parsing:

func argumentsFromArgumentData(_ data: Data) throws -> [String] {

    // The algorithm here was ‘stolen’ from the Darwin source for `ps`.
    //
    // <https://opensource.apple.com/source/adv_cmds/adv_cmds-176/ps/print.c.auto.html>
    
    // Parse `argc`.  We’re assuming the value is little endian here, which is
    // currently accurate but it could be a problem if we’ve “gone back to
    // metric”.
    
    var remaining = data[...]
    guard remaining.count >= 6 else {
        throw ParseError.unexpectedEnd
    }
    let count32 = remaining.prefix(4).reversed().reduce(0, { $0 << 8 | UInt32($1) })
    remaining = remaining.dropFirst(4)

    // Skip the saved executable path.
    
    remaining = remaining.drop(while: { $0 != 0 })
    remaining = remaining.drop(while: { $0 == 0 })

    // Now parse `argv[0]` through `argv[argc - 1]`.

    var result: [String] = []
    for _ in 0..<count32 {
        let argBytes = remaining.prefix(while: { $0 != 0 })
        guard let arg = String(bytes: argBytes, encoding: .utf8) else {
            throw ParseError.argumentIsNotUTF8
        }
        result.append(arg)
        remaining = remaining.dropFirst(argBytes.count)
        guard remaining.count != 0 else {
            throw ParseError.unexpectedEnd
        }
        remaining = remaining.dropFirst()
    }
    return result
}

enum ParseError: Error {
    case unexpectedEnd
    case argumentIsNotUTF8
}

There’s nothing unsafe here, so the worst it can do is return the wrong results. Which is always going to be a risk anyway because this data is under the control of the target process.
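Because the parser takes a plain `Data`, it can be exercised without touching `sysctl` at all. The helper below is a hypothetical test fixture, not part of the code above: it builds a block in the layout the parser expects, namely a 4-byte little-endian `argc`, the NUL-terminated executable path, some NUL padding, and then each argument NUL-terminated. The amount of padding is an arbitrary choice for illustration.

```swift
import Foundation

// Hypothetical test fixture: builds a synthetic arguments block with a
// 4-byte little-endian `argc`, the NUL-terminated executable path,
// `padding` extra NUL bytes, then each argument followed by a NUL.
func makeArgumentData(executablePath: String, arguments: [String], padding: Int = 3) -> Data {
    var data = Data()
    let argc = UInt32(arguments.count)
    // Emit `argc` least significant byte first, independent of host endianness.
    for shift in stride(from: 0, to: 32, by: 8) {
        data.append(UInt8(truncatingIfNeeded: argc >> shift))
    }
    data.append(contentsOf: executablePath.utf8)
    data.append(0)
    data.append(contentsOf: repeatElement(0 as UInt8, count: padding))
    for argument in arguments {
        data.append(contentsOf: argument.utf8)
        data.append(0)
    }
    return data
}

let sample = makeArgumentData(executablePath: "/bin/echo", arguments: ["echo", "héllo"])
print(sample.map { String($0, radix: 16) }.joined(separator: " "))
```

Passing `sample` to `argumentsFromArgumentData(_:)` should yield the two arguments, and passing `sample.dropLast()` should throw `ParseError.unexpectedEnd`, since the final argument then has no terminating NUL.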

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Just as a reminder, I encourage folks to reply in a reply rather than in the comments. See tip 5 in Quinn’s Top Ten DevForums Tips.

alexrepty wrote:

it's been a few years since you wrote that response and I'm wondering if anything has changed since then. Any new API?

Not that I’m aware of. I checked on the state of FB9149624 and it remains unfixed.

Should we continue to dupe that FB … ?

Yes. The nice thing about filing a dup is that Feedback Assistant will tell you when (well, if :-) the bug gets fixed.

to help with prioritisation

I can’t make any promises on that front.

Are there any current alternatives to this approach, to get the information for arbitrary processes?

No. The problem is that the system hasn’t stored a copy of the arguments in a trusted place. The only copy is the one at the top of the main thread’s stack, and that can’t be trusted. It’s hard to see how we could provide a trusted API for this without significantly increasing system memory [1].

I know there's es_exec_arg() in the endpoint security framework, but as I understand it, for that approach I'd have to catch the process launch

Yep. And ES is a very heavy hammer. If you have an ES client for other reasons, this might be a reasonable option, but I’m reluctant to suggest adding an ES client just for this purpose.

IMPORTANT If you do use ES for this, remember that:

  • ES clients must have excellent performance.

  • ARG_MAX is pretty huge these days (1 MiB), and it’s that big because some programs, especially developer tools, need it to be that big.

You’ll have to code your ES client to efficiently store the argument list, in terms of time and space.
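One possible shape for such storage, as a rough sketch only: keep each process's arguments as a single NUL-separated blob with a per-process byte cap, and decode to strings only when asked. The type `ArgumentStore`, its methods, and the cap are all hypothetical and not part of Endpoint Security; a real client would also need to evict entries when processes exit and might prefer to copy raw bytes rather than going through `String`.

```swift
import Foundation

// Hypothetical store, not part of Endpoint Security: keeps each process's
// arguments as one NUL-separated blob, capped at `maxBytesPerProcess`.
struct ArgumentStore {
    let maxBytesPerProcess: Int
    private var blobs: [Int32: Data] = [:]

    init(maxBytesPerProcess: Int) {
        self.maxBytesPerProcess = maxBytesPerProcess
    }

    // Records arguments for `pid`, stopping at the first argument that
    // would push the blob past the byte budget.
    mutating func record(pid: Int32, arguments: [String]) {
        var blob = Data()
        for argument in arguments {
            let bytes = argument.utf8
            guard blob.count + bytes.count + 1 <= maxBytesPerProcess else { break }
            blob.append(contentsOf: bytes)
            blob.append(0)
        }
        blobs[pid] = blob
    }

    // Decodes on demand, so entries that are never queried cost no String work.
    func arguments(for pid: Int32) -> [String]? {
        guard let blob = blobs[pid] else { return nil }
        return blob.split(separator: 0, omittingEmptySubsequences: false)
            .dropLast()  // drop the empty piece after the final NUL
            .map { String(decoding: $0, as: UTF8.self) }
    }

    // Call when the process exits so the table does not grow without bound.
    mutating func forget(pid: Int32) { blobs[pid] = nil }
}

var store = ArgumentStore(maxBytesPerProcess: 16)
store.record(pid: 42, arguments: ["ls", "-l", "a-very-long-argument"])
print(store.arguments(for: 42) ?? [])
```

One blob per process avoids a separate heap allocation per argument, and the cap bounds the worst case when a process is launched with an argument list close to ARG_MAX.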

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

[1] We can’t make it read-only because processes do modify this memory and would fail if it weren’t read/write. I guess making it COW would work, because most processes don’t actually modify the memory, but …

Well, I’m rambling at this point. It’s not my job to fix problems like this, which is probably a good thing (-:
