swift & simd

I've been porting Objective-C code to Swift 2.2 / Xcode 7.3.1.


When I use certain simd types, though, they are not defined. For instance, vector_float16 is not defined. Looking in the header, I see that something called _METAL_VERSION_ is defined, and that causes vector_float16 to not be defined. I have the Accelerate framework linked to the project, and my Objective-C files build with vector_float16, but my Swift files do not.


Example:


import simd


let a = vector_float4(1.0,2.0,3.0,4.0) // works

let b = vector_float16(1.0,2.0,3.0,4.0) // vector_float16 not defined


Is there some project setting I'm missing? I could fall back to vDSP_vmul but I find using simd much easier.

Replies

Also, sometimes I attempt to build out a 4 by 4 matrix and I get the following compiler error:


"error: expression was too complex to be solved in reasonable time; consider breaking up the expression into distinct sub-expressions"


Yet the code is valid. But it's attempting to second-guess what I've coded. The more I change, the more likely the regression test will fail.

I don’t have any answers for your vector_float16 question, alas.

With regard to your “expression was too complex” problem, this is one of the known gotchas with the Swift type inference system. Simple examples like this work:

import simd
let m1 = float4x4([
    [1, 2, 3, 4],
    [1, 2, 3, 4],
    [1, 2, 3, 4],
    [1, 2, 3, 4]
])

but seemingly innocuous-looking extensions cause real problems:

let m2 = float4x4([
    [1, 2, 3, 2 + 2],
    [1, 2, 3, 2 + 2],
    [1, 2, 3, 2 + 2],
    [1, 2, 3, 4]
])

In most cases you can resolve this with some strategic type annotations. For example, you can fix the above by doing this:

let m3 = float4x4([
    [1, 2, 3, (2 + 2) as Float],
    [1, 2, 3, (2 + 2) as Float],
    [1, 2, 3, (2 + 2) as Float],
    [1, 2, 3, 4]
])

If you hit something you can’t resolve, please post it.
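
The same idea can be taken one step further by hoisting the arithmetic into explicitly typed constants, so the type checker only ever sees simple expressions. This is a minimal sketch, not the only way to do it: the names topRight, row and m4 are made up for illustration, and SIMD4<Float> assumes a Swift 5 or later toolchain. The explicit annotation also avoids surprises from floating-point literals defaulting to Double.

```swift
import simd

// Hypothetical names. The explicit Float annotation means the type
// checker never has to search for a type for `2 + 2`.
let topRight: Float = 2 + 2
let row = SIMD4<Float>(1, 2, 3, topRight)

// Built from already-typed pieces, so there is very little left to infer.
let m4 = float4x4([
    row,
    row,
    row,
    SIMD4<Float>(1, 2, 3, 4)
])

print(m4[0][3]) // fourth element of the first vector
```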

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

I don’t have any answers for your vector_float16 question, alas.

I ran this past one of our numerics experts and he confirmed that Swift does not yet support vector_float16. Sorry I don’t have better news.
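
For anyone reading this later: Swift 5 added generic SIMD types to the standard library, including a 16-lane SIMD16<Float>, so a 16-wide float vector can be written without vector_float16. That was not available in the Swift 2.2 toolchain discussed in this thread. A minimal sketch, assuming Swift 5 or later:

```swift
// Assumes Swift 5 or later; SIMD16 lives in the standard library.
let a = SIMD16<Float>(repeating: 2.0)

var b = SIMD16<Float>(repeating: 0.0)
for lane in 0..<b.scalarCount {
    b[lane] = Float(lane)          // lanes hold 0, 1, 2, ... 15
}

// Element-wise multiply across all 16 lanes, the job vDSP_vmul
// would otherwise do on two 16-element arrays.
let product = a * b

print(product[3])  // 6.0
```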

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

Thanks for the heads up about type inferencing. Ya, the top three rows all had many additions. When porting code from Objective-C, this problem has popped up a few times. I notice the literal type defaults to Double and I have to be more explicit with the Float type.


I had some product results from vDSP_vmul Accelerate methods (or what was the simd length 16 in Objective-C). These results were being added together in the matrix array declaration. I figured there were enough adds to just call vDSP_vadd. Then I put those results into the matrix and the type inferencing issue went away. Ironically, Swift prodded me toward an improved solution.


Regarding arrays, I had another simple question, but I want to be 100% sure. 😁 What is the proper Swift analog to a C array of a packed type on the stack, not an array malloced on the heap? That is, the stack would contain memory of the atomic type (float, for instance) without any other data structures. I'm assuming it's Array of Float, and that Array works off a buffer of packed floats internally. (?) Looking at other Swift examples, Array looked like the right choice because I could pass it to C functions from Swift easily.

I researched Array further and found that it acts a little differently from what I initially read about structs in general, because it's copy on mutate. So I assume that if I pass an Array into a method, its reference is passed without a copy, but the moment its contents change or mutate, there is a copy. As a theoretical question: if I had an Array that was 10K rows by 10K columns and I tweaked one row, it would copy 10K floats because the row is an Array struct, whereas C would modify the one float. (?) I get a little worried there might be copies. Doing an alloc on the heap was not an alternative because it's slow.

Regarding arrays, I had another simple question …

That’s your idea of a simple question (-:

First things first, Swift arrays are (almost?) always allocated on the heap. The Array type itself is a struct, which is allocated on the stack, but the contents of the array are held in a heap-based buffer. It’s this indirection that allows for efficient copy on write (COW), appending and removing elements, and so on.
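
To make the copy-on-write behaviour concrete, here is a small sketch. The variable names are illustrative only. It shows that assigning an array does not disturb the original when the copy is later mutated, and that a uniquely referenced flat array can have a single element changed in place.

```swift
let original: [Float] = [1, 2, 3, 4]

// Assignment copies only the small Array struct; both values
// refer to the same heap buffer at this point.
var copy = original

// The first write through `copy` finds the buffer shared and
// duplicates it, so `original` is unaffected.
copy[0] = 99

print(original[0], copy[0])  // 1.0 99.0

// A flat array with a single owner is mutated in place: writing one
// element does not copy the other elements.
let columns = 4
var grid = [Float](repeating: 0, count: 3 * columns)  // 3 rows x 4 columns
grid[1 * columns + 2] = 7    // row 1, column 2
print(grid[6])               // 7.0
```

Storing the matrix as one flat array indexed by row * columns + column, rather than as an array of row arrays, keeps every element in a single buffer.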

Swift has no direct equivalent of C’s fixed-size array. When the C importer sees a C fixed-size array, it imports it as an N-element tuple, which works but is super kludgy and not the way you’d write the code in Swift itself.
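
As a rough illustration of why the tuple form is awkward, the sketch below uses a hand-written four-element tuple as a stand-in for what the importer would produce for a hypothetical C declaration such as float values[4]. Elements are reached by position, and looping over them means dropping down to an unsafe view of the tuple's storage.

```swift
// Stand-in for how a C field declared `float values[4]` shows up in Swift.
let values: (Float, Float, Float, Float) = (1, 2, 3, 4)

// Individual elements are reached by position, not by subscript.
let second = values.1

// To loop over the elements, view the tuple's storage as a buffer of Float.
let total = withUnsafeBytes(of: values) { raw -> Float in
    return raw.bindMemory(to: Float.self).reduce(0, +)
}

print(second, total)  // 2.0 10.0
```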

DSP-ish stuff isn’t really my thing, so I don’t have a lot of direct experience here, but if I were in your shoes I’d build an abstraction layer around these large buffers of floats so that you can explicitly manage their allocation, copying and destruction. If you’re explicitly trying to deal with mutable state (and thus avoid COW), you can make this a class and pass around a reference.

One of the nice things about Swift is that you can build abstraction layers like this without (necessarily) paying a runtime performance cost. Specifically, if you declare your class final then Swift can devirtualise, inline, and so on.

Here’s a very quick example of what I’m talking about:

final class FixedBuffer {

    init() {
        // Allocate uninitialised storage for 1024 x 1024 floats, then zero it.
        self.buffer = UnsafeMutablePointer<Float>.allocate(capacity: 1024 * 1024)
        self.buffer.initialize(repeating: 0.0, count: 1024 * 1024)
    }

    deinit {
        print("deinit")
        // Tear down the elements, then release the whole allocation.
        self.buffer.deinitialize(count: 1024 * 1024)
        self.buffer.deallocate()
    }

    fileprivate var buffer: UnsafeMutablePointer<Float>

    subscript(row: Int, column: Int) -> Float {
        get {
            precondition((0..<1024).contains(row))
            precondition((0..<1024).contains(column))
            return self.buffer[row * 1024 + column]
        }
        set {
            precondition((0..<1024).contains(row))
            precondition((0..<1024).contains(column))
            self.buffer[row * 1024 + column] = newValue
        }
    }
}

func inner(b: FixedBuffer) {
    print(">inner")
    b[1, 1] = 1.0
    print("<inner")
}

func outer() {
    print(">outer")
    let b = FixedBuffer()
    print(b[0, 0])
    print(b[1, 1])
    inner(b: b)
    print(b[0, 0])
    print(b[1, 1])
    print("<outer")
}

outer()

It prints:

>outer
0.0
0.0
>inner
<inner
0.0
1.0
<outer
deinit

Another nice thing about this approach is that you can build specific adapters for specific APIs. If you need to call some vDSP function that takes a traditional C array of Floats, you can write a method that calls a closure with the parameters you need. For example:

extension FixedBuffer {   

    func with<Result>(row: Int, body: (_ rowBase: UnsafeMutablePointer<Float>, _ count: Int) throws -> Result) rethrows -> Result {
        return try body(self.buffer + row * 1024, 1024)
    }
}

which you can call like this:

b.with(row: 0) { (rowBase: UnsafeMutablePointer<Float>, count: Int) in
    … call vDSP here …
}

Because the closure is non-escaping (the new default in Swift 3) this sort of thing can be really efficient.
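
To show the adapter pattern end to end, here is a self-contained sketch that scales one row in place with vDSP_vsmul. SmallBuffer is a hypothetical, cut-down stand-in for FixedBuffer (4 rows of 8 floats, initialised to 1.0) so that the example stays short; the choice of vDSP_vsmul is also just for illustration. It assumes an Apple platform with the Accelerate framework and a Swift 5 or later toolchain.

```swift
import Accelerate

// Hypothetical, cut-down stand-in for FixedBuffer: 4 rows of 8 floats.
final class SmallBuffer {
    static let rows = 4
    static let columns = 8
    private let buffer: UnsafeMutablePointer<Float>

    init() {
        let count = SmallBuffer.rows * SmallBuffer.columns
        buffer = UnsafeMutablePointer<Float>.allocate(capacity: count)
        buffer.initialize(repeating: 1.0, count: count)
    }

    deinit {
        buffer.deinitialize(count: SmallBuffer.rows * SmallBuffer.columns)
        buffer.deallocate()
    }

    subscript(row: Int, column: Int) -> Float {
        precondition((0..<SmallBuffer.rows).contains(row))
        precondition((0..<SmallBuffer.columns).contains(column))
        return buffer[row * SmallBuffer.columns + column]
    }

    // Hands the closure a pointer to the start of one row plus its length.
    func with<Result>(row: Int, body: (_ rowBase: UnsafeMutablePointer<Float>, _ count: Int) throws -> Result) rethrows -> Result {
        precondition((0..<SmallBuffer.rows).contains(row))
        return try body(buffer + row * SmallBuffer.columns, SmallBuffer.columns)
    }
}

let small = SmallBuffer()
var scale: Float = 2.0

// Multiply every element of row 1 by `scale`, writing back over the row.
small.with(row: 1) { rowBase, count in
    vDSP_vsmul(rowBase, 1, &scale, rowBase, 1, vDSP_Length(count))
}

print(small[0, 0], small[1, 0])  // 1.0 2.0
```

The pointer is only valid inside the closure, which is what lets the buffer class keep full control over allocation and lifetime.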

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

It's good to know Array is using the heap. But I tended to want to avoid the heap in the event virtual memory gets paged out to disk -- this might cause a frame skip or pause. It probably would not be noticeable in vanilla user interface code. Perhaps hardware and operating systems have grown more advanced. Given that most modern hardware no longer uses hard drives, this might not be as much of an issue -- my Mac mini does, though; I had to max out the RAM just to run OS X Server, and before that I could hear the disk constantly grinding. 😊 I kind of recollect iOS does not use virtual memory. (Don't quote me.) So, the trick is to avoid the heap and make it small enough to be resident in the processor's cache.


Maybe the best approach for now is to type alias the Array. So, I can try different implementations -- like your suggestion. In each case I can then benchmark.
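
A rough sketch of what that could look like. SampleStorage and fill are hypothetical names, and the wall-clock timing with Date is only a crude stand-in for a proper benchmark. Any replacement type would need to offer the same operations used here (init(repeating:count:), count and an Int subscript); ContiguousArray<Float> is one drop-in candidate.

```swift
import Foundation

// Hypothetical alias: change the right-hand side to trial another
// storage type that offers the same operations.
typealias SampleStorage = [Float]

func fill(_ storage: inout SampleStorage, with value: Float) {
    for i in 0..<storage.count {
        storage[i] = value
    }
}

var samples = SampleStorage(repeating: 0, count: 1_000_000)

// Crude timing: wall-clock time around a single run.
let start = Date()
fill(&samples, with: 1.5)
let elapsed = Date().timeIntervalSince(start)

print("filled \(samples.count) floats in \(elapsed) s")
```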


Thanks again.

But I tended to want to avoid the heap in the event virtual memory gets paged out to disk -- this might cause a frame skip or pause.

The heap and the stack are not different in this respect. On platforms that support anonymous virtual memory (macOS), they can both be paged out. On platforms that don’t (iOS and its descendants), they are both always resident.

I kind of recollect iOS does not use virtual memory.

iOS does support virtual memory, just not anonymous virtual memory. The VM system can page to and from files (most commonly this is code being paged in from an executable, but also memory mapped files), it’s just that there’s no default pager so stuff allocated on the heap always remains resident.

For more info, see this post on the old DevForums (yikes that’s a long time ago!).

In each case I can then benchmark.

Can’t disagree with that. It’s very easy to speculate about what performs well (as I’ve been doing here) but such speculations are often incorrect.

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"