Post marked as solved with 4 replies, 1,914 views
I have a Swift app that does a lot of numerical processing on ordered pairs and vectors, so I'm looking into ways to improve performance, including adopting SIMD from the Accelerate framework for some calculations. So far I'm not seeing any improvement.
There are some related posts on the forums, but they seem inconclusive and also a bit more complex. I pared my testing down to a couple of brief XCTests with self.measure blocks that repeat add and multiply operations on two double values. The tests set random initial values to ensure the compiler can't optimize the loop calculations based on constants. There's also no big collection of fixture data, so allocations, vector index dereferencing, or similar issues can't be involved.
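For anyone who wants to try the same comparison outside XCTest, here is a minimal sketch of the two loops described above. It is not the test code from this post. It assumes a hypothetical `seconds` helper built on `DispatchTime`, uses `SIMD2<Double>` (the standard library type that `simd_double2` is a typealias for on Apple platforms), and folds every result into a running sum so that no operation is overwritten before it is read. Timings from a sketch like this depend heavily on the build's optimization settings, so both loops should be compared under the same configuration.

```swift
import Dispatch

// Hypothetical helper (not part of XCTest): wall-clock time of one closure, in seconds.
func seconds(_ body: () -> Void) -> Double {
    let start = DispatchTime.now().uptimeNanoseconds
    body()
    return Double(DispatchTime.now().uptimeNanoseconds - start) / 1e9
}

let iterations = 100_000
let increment = Double.random(in: 0.0...0.1)

// Shared random starting values so both loops do the same arithmetic.
let x0 = Double.random(in: 0.0...1.0)
let y0 = Double.random(in: 0.0...1.0)
let xR = Double.random(in: 0.0...1.0)
let yR = Double.random(in: 0.0...1.0)

// Scalar variant: four separate operations per iteration, all accumulated.
var xL = x0
var yL = y0
var scalarSum = 0.0
let scalarTime = seconds {
    for _ in 0..<iterations {
        scalarSum += (xL + xR) + (yL + yR) + (xL * xR) + (yL * yR)
        xL += increment
        yL += increment
    }
}

// SIMD variant: one vector add and one vector multiply per iteration.
var vL = SIMD2<Double>(x0, y0)
let vR = SIMD2<Double>(xR, yR)
let vIncrement = SIMD2<Double>(repeating: increment)
var simdSum = SIMD2<Double>(repeating: 0.0)
let simdTime = seconds {
    for _ in 0..<iterations {
        simdSum += (vL + vR) + (vL * vR)
        vL += vIncrement
    }
}

// Reading both sums keeps the loop work observable.
print("scalar: \(scalarTime) s, sum \(scalarSum)")
print("simd:   \(simdTime) s, sum \(simdSum.x + simdSum.y)")
```

The two sums should agree up to floating-point rounding, which is a quick check that both loops really perform the same calculations.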
The regular multiple-instruction code runs an order of magnitude faster than the SIMD code! I don't understand why this is. Would the SIMD code be faster in C++? Could some Swift conversion be involved? Or is there some aspect of my SIMD code that incurs a known penalty? I'm curious whether anyone out there is using SIMD in Swift in production, and whether you see anything in my test code that explains the difference.
// Method of an XCTestCase subclass; the file needs `import XCTest`.
func testPerformance_double() {
    var xL = Double.random(in: 0.0...1.0)
    var yL = Double.random(in: 0.0...1.0)
    let xR = Double.random(in: 0.0...1.0)
    let yR = Double.random(in: 0.0...1.0)
    let increment = Double.random(in: 0.0...0.1)
    Swift.print("xL: \(xL), xR: \(xR), increment: \(increment)")
    var result: Double = 0.0
    self.measure {
        for _ in 0..<100000 {
            result = xL + xR
            result = yL + yR
            result = xL * xR
            result = yL * yR
            xL += increment
            yL += increment
        }
    }
    Swift.print("last result: \(result)") // read from result
}
// Method of the same XCTestCase subclass; the file also needs `import simd`.
func testPerformance_simd() {
    var vL = simd_double2(Double.random(in: 0.0...1.0), Double.random(in: 0.0...1.0))
    let vR = simd_double2(Double.random(in: 0.0...1.0), Double.random(in: 0.0...1.0))
    let increment = Double.random(in: 0.0...0.1)
    let vIncrement = simd_double2(increment, increment)
    var result = simd_double2(0.0, 0.0)
    Swift.print("vL.x: \(vL.x), vL.y: \(vL.y), increment: \(increment)")
    self.measure {
        for _ in 0..<100000 {
            result = vL + vR
            result = vL * vR
            vL = vL + vIncrement
        }
    }
    Swift.print("last result: \(String(describing: result))") // read from result
}
The measurements show the block with SIMD operations taking about 12 times as long as the block with separate scalar operations (0.579 s versus 0.049 s on average)!
...testPerformance_double measured [Time, seconds] average: 0.049, relative standard deviation: 3.059%, values: [0.049262, 0.049617, 0.048499, 0.047859, 0.048270, 0.048564, 0.047529, 0.052578, 0.047267, 0.047432], performanceMetricID:com.apple.XCTPerformanceMetric_WallClockTime, baselineName: "", baselineAverage: , maxPercentRegression: 10.000%, maxPercentRelativeStandardDeviation: 10.000%, maxRegression: 0.100, maxStandardDeviation: 0.100
...testPerformance_simd measured [Time, seconds] average: 0.579, relative standard deviation: 5.932%, values: [0.626196, 0.605790, 0.635180, 0.611197, 0.553179, 0.548163, 0.552648, 0.549264, 0.552745, 0.551465], performanceMetricID:com.apple.XCTPerformanceMetric_WallClockTime, baselineName: "", baselineAverage: , maxPercentRegression: 10.000%, maxPercentRelativeStandardDeviation: 10.000%, maxRegression: 0.100, maxStandardDeviation: 0.100
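To put a number on the gap, the ratio can be recomputed from the ten per-run values in each log line. The following is a small sketch of that arithmetic only; the `mean` helper and the variable names are illustrative and not part of the tests.

```swift
// Per-run wall-clock times (seconds), copied from the two log lines above.
let doubleRuns = [0.049262, 0.049617, 0.048499, 0.047859, 0.048270,
                  0.048564, 0.047529, 0.052578, 0.047267, 0.047432]
let simdRuns = [0.626196, 0.605790, 0.635180, 0.611197, 0.553179,
                0.548163, 0.552648, 0.549264, 0.552745, 0.551465]

// Illustrative helper: arithmetic mean of a non-empty array.
func mean(_ values: [Double]) -> Double {
    values.reduce(0, +) / Double(values.count)
}

// How many times longer the SIMD test took, on average.
let ratio = mean(simdRuns) / mean(doubleRuns)
print("simd / double ratio: \(ratio)")
```

The result comes out just under 12, which matches the averages XCTest reports (0.579 s and 0.049 s).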