Activity

Reply to How do we use the computational power of A17 Pro Neural Engine?
I found the relevant descriptions in the coremltools documentation. On newer hardware, e.g. iPhone 15 Pro (A17 Pro), there is increased int8-int8 compute available on the Neural Engine (NE).
Impact on Latency and Compute Unit Considerations: https://apple.github.io/coremltools/docs-guides/source/quantization-overview.html#impact-on-latency-and-compute-unit-considerations
Linear 8-Bit Quantization: https://apple.github.io/coremltools/docs-guides/source/performance-impact.html#linear-8-bit-quantization
The key point for A17 Pro is to quantize both weights and activations, using per-tensor quantization.
May ’24
Reply to Are there any Conditional operators in vDSP?
I think I have solved it myself.

let numerators: [Float] = ...
let denominators: [Float] = ... // actually these are integers in my case
// Float.leastNonzeroMagnitude * 10000000 is a minimal number that avoids NaN when dividing.
let denominatorsEpsilon = vDSP.add(Float.leastNonzeroMagnitude * 10000000, denominators)
let divides = vDSP.divide(numerators, denominatorsEpsilon)
let alternativesOfNAN: [Float] = ...
// denominatorsClip is 0 or 1 since the denominators are integers
let denominatorsClip = vDSP.clip(denominators, to: 0...1)
let result = vDSP.subtract(multiplication: (divides, denominatorsClip), multiplication: (alternativesOfNAN, vDSP.add(-1, denominatorsClip)))

I don't like this code, since it is not precise and includes many unnecessary operations, but it is much faster than code that checks isNaN.
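For anyone who wants to check the arithmetic on small inputs, here is a minimal plain-Swift sketch of the same blend, without Accelerate. The helper name `blendedDivide` and the sample values are made up for illustration; this is only a scalar reference for the formula `divides * clip - alternatives * (clip - 1)`, not a replacement for the vDSP version.

```swift
// Scalar reference for the vDSP blend above (no Accelerate needed).
// `blendedDivide` is a hypothetical helper name, not a vDSP API.
func blendedDivide(_ numerators: [Float],
                   _ denominators: [Float],
                   alternatives: [Float]) -> [Float] {
    precondition(numerators.count == denominators.count
                 && numerators.count == alternatives.count)
    // Same tiny offset as in the vDSP code.
    let epsilon = Float.leastNonzeroMagnitude * 10_000_000
    return numerators.indices.map { i -> Float in
        let quotient = numerators[i] / (denominators[i] + epsilon)
        // Clip to 0...1: 0 for a zero denominator, 1 for a positive integer one.
        let mask = min(max(denominators[i], 0), 1)
        // Keeps the quotient where mask == 1, the alternative where mask == 0.
        return quotient * mask - alternatives[i] * (mask - 1)
    }
}

// Sample values are assumptions chosen to be exactly representable.
let result = blendedDivide([1, 0, 6], [2, 0, 3], alternatives: [9, 9, 9])
print(result) // [0.5, 9.0, 2.0]
```

One caveat that follows from the formula: the trick relies on the numerator being zero (or very small) wherever the denominator is zero. With a numerator of 10 and a denominator of 0, the quotient overflows to inf, and inf * 0 is NaN again.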
Feb ’24
Reply to Are there any Conditional operators in vDSP?
Sorry, I misunderstood. What I need is to mask NaN values, not to gather the non-NaN values. As you said, vDSP_vdiv returns NaN (not inf) when you divide by zero, and I cannot find any effective compare operation to mask NaN. Are there any good functions? What I am really doing is:

let numerators: [Float] = ...
let denominators: [Float] = ...
var divides = vDSP.divide(numerators, denominators)
let alternativesOfNAN: [Float] = ...
for i in divides.indices {
    if divides[i].isNaN {
        divides[i] = alternativesOfNAN[i]
    }
}

The last loop is quite slow compared with the other parts, which use vDSP.
Feb ’24