There are some related posts on the forums, but they seem inconclusive and are also more complex. I pared my testing down to two brief XCTests with self.measure blocks that repeatedly add and multiply two double values. The tests use random initial values so the compiler can't optimize the loop calculations based on constants. There's also no large collection of fixture data, so allocations, vector index dereferencing, and similar issues can't be involved.
The regular multiple-instruction code runs an order of magnitude faster than the SIMD code! I don't understand why. Would the SIMD code be faster in C++? Could it be some Swift conversion? Or is some aspect of my SIMD code incurring a known penalty? I'm curious whether anyone out there is using SIMD in Swift in production, and whether you see anything in my test code that explains the difference.
```swift
func testPerformance_double() {
    var xL = Double.random(in: 0.0...1.0)
    var yL = Double.random(in: 0.0...1.0)
    let xR = Double.random(in: 0.0...1.0)
    let yR = Double.random(in: 0.0...1.0)
    let increment = Double.random(in: 0.0...0.1)
    Swift.print("xL: \(xL), xR: \(xR), increment: \(increment)")

    var result: Double = 0.0
    self.measure {
        for _ in 0..<100000 {
            result = xL + xR
            result = yL + yR
            result = xL * xR
            result = yL * yR
            xL += increment
            yL += increment
        }
    }
    Swift.print("last result: \(result)") // read from result
}
```
```swift
// Requires `import simd` (for simd_double2) and `import XCTest` at the top of the test file.
func testPerformance_simd() {
    var vL = simd_double2(Double.random(in: 0.0...1.0), Double.random(in: 0.0...1.0))
    let vR = simd_double2(Double.random(in: 0.0...1.0), Double.random(in: 0.0...1.0))
    let increment = Double.random(in: 0.0...0.1)
    let vIncrement = simd_double2(increment, increment)
    var result = simd_double2(0.0, 0.0)
    Swift.print("vL.x: \(vL.x), vL.y: \(vL.y), increment: \(increment)")

    self.measure {
        for _ in 0..<100000 {
            result = vL + vR
            result = vL * vR
            vL = vL + vIncrement
        }
    }
    Swift.print("last result: \(String(describing: result))") // read from result
}
```
The measurements show the block with SIMD operations taking roughly 12 times as long as the block with multiple scalar operations (0.579 s vs. 0.049 s on average):
...testPerformance_double measured [Time, seconds] average: 0.049, relative standard deviation: 3.059%, values: [0.049262, 0.049617, 0.048499, 0.047859, 0.048270, 0.048564, 0.047529, 0.052578, 0.047267, 0.047432], performanceMetricID:com.apple.XCTPerformanceMetric_WallClockTime, baselineName: "", baselineAverage: , maxPercentRegression: 10.000%, maxPercentRelativeStandardDeviation: 10.000%, maxRegression: 0.100, maxStandardDeviation: 0.100
...testPerformance_simd measured [Time, seconds] average: 0.579, relative standard deviation: 5.932%, values: [0.626196, 0.605790, 0.635180, 0.611197, 0.553179, 0.548163, 0.552648, 0.549264, 0.552745, 0.551465], performanceMetricID:com.apple.XCTPerformanceMetric_WallClockTime, baselineName: "", baselineAverage: , maxPercentRegression: 10.000%, maxPercentRelativeStandardDeviation: 10.000%, maxRegression: 0.100, maxStandardDeviation: 0.100
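One thing the two tests don't separate out is the XCTest harness and the build configuration they run under. As a possible cross-check, here is a minimal sketch of the same two loops as a standalone command-line program that could be compiled with and without optimization (for example `swiftc -O main.swift` versus `swiftc -Onone main.swift`). Everything in it beyond the loop bodies is an assumption for illustration: `elapsedSeconds`, `scalarLoop`, and `simdLoop` are hypothetical names, it uses the standard library's `SIMD2<Double>` (which `simd_double2` aliases on Apple platforms) so it needs no `import simd`, and it accumulates into a sum rather than overwriting `result`, so that an optimizing build is less likely to discard the work. That last change means the timings are not directly comparable with the XCTest numbers above.

```swift
import Foundation

// Hypothetical helper (not part of the original tests): runs `body` once
// and returns the elapsed wall-clock time in seconds.
func elapsedSeconds(_ body: () -> Void) -> Double {
    let start = Date()
    body()
    return Date().timeIntervalSince(start)
}

// Scalar variant: two adds and two multiplies per iteration, as in
// testPerformance_double, but accumulated so the results are actually used.
func scalarLoop(iterations: Int) -> Double {
    var xL = Double.random(in: 0.0...1.0)
    var yL = Double.random(in: 0.0...1.0)
    let xR = Double.random(in: 0.0...1.0)
    let yR = Double.random(in: 0.0...1.0)
    let increment = Double.random(in: 0.0...0.1)
    var sum = 0.0
    for _ in 0..<iterations {
        sum += (xL + xR) + (yL + yR) + (xL * xR) + (yL * yR)
        xL += increment
        yL += increment
    }
    return sum
}

// SIMD variant: one vector add and one vector multiply per iteration,
// as in testPerformance_simd, accumulated lane by lane.
func simdLoop(iterations: Int) -> Double {
    var vL = SIMD2<Double>(Double.random(in: 0.0...1.0), Double.random(in: 0.0...1.0))
    let vR = SIMD2<Double>(Double.random(in: 0.0...1.0), Double.random(in: 0.0...1.0))
    let vIncrement = SIMD2<Double>(repeating: Double.random(in: 0.0...0.1))
    var sum = SIMD2<Double>(repeating: 0.0)
    for _ in 0..<iterations {
        sum += (vL + vR) + (vL * vR)
        vL += vIncrement
    }
    return sum.x + sum.y
}

var scalarResult = 0.0
var simdResult = 0.0
let scalarTime = elapsedSeconds { scalarResult = scalarLoop(iterations: 100_000) }
let simdTime = elapsedSeconds { simdResult = simdLoop(iterations: 100_000) }

// Print the results as well as the timings so the sums are read.
print("scalar: \(scalarTime) s, result: \(scalarResult)")
print("simd:   \(simdTime) s, result: \(simdResult)")
```

If the gap between the two loops changes substantially between the optimized and unoptimized builds, that would point toward the build settings of the test target rather than the SIMD code itself. A single timed run like this is noisy, so repeating it a few times (as self.measure does) would give steadier numbers.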