Low level API to take control of Neural Engine

Hi, I would love to code with the Neural Engine on my MacBook Pro (M1, 2020).

Is there any low-level API to create my very own work-loads?

I am working with audio and MIDI, as well as sound synthesis and mixing.

Can I use the Neural Engine to offload the CPU? I am especially interested in parallelism using threads.

My programming languages of choice are ANSI C and Objective-C.

Accepted Reply

This seems to be what I need. Vector operations and such.

https://developer.apple.com/library/archive/documentation/Performance/Conceptual/vDSP_Programming_Guide/Introduction/Introduction.html

and

https://developer.apple.com/documentation/accelerate/veclib

The Neural Engine looks versatile.

Replies

Does vecLib use the Neural Engine, or does it execute on the GPU?
  • For vecLib, Accelerate uses the undocumented AMX coprocessor. Libraries that are also accelerated:
      • vImage: high-level image processing, converting between formats, image manipulation.
      • BLAS: the industry standard for linear algebra.
      • BNNS: running and training neural networks.
      • vDSP: digital signal processing. Fourier transforms, convolution, image processing, or really any signal, including audio.
      • LAPACK: high-level linear algebra functions for solving linear equations.
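
Since the original question is about audio mixing in C, here is a minimal sketch of what using vDSP for that could look like. The helper name mix_into is made up for illustration. On macOS it calls vDSP_vsma (multiply a vector by a scalar and add another vector) and needs to be linked with -framework Accelerate; on other platforms it falls back to a plain loop so the file still compiles. This runs on the CPU, not on the Neural Engine.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__APPLE__)
#include <Accelerate/Accelerate.h> /* link with: -framework Accelerate */
#endif

/* Hypothetical helper (not part of Accelerate):
 * mixes `src` into `dst` with a gain, i.e. dst[i] += gain * src[i]. */
void mix_into(float *dst, const float *src, float gain, size_t n)
{
    assert(dst != NULL && src != NULL);
#if defined(__APPLE__)
    /* vDSP_vsma computes D[i] = A[i] * B + C[i].
     * Here A = src, B = gain, and C = D = dst (in place, stride 1). */
    vDSP_vsma(src, 1, &gain, dst, 1, dst, 1, (vDSP_Length)n);
#else
    {
        /* Portable fallback with the same result. */
        size_t i;
        for (i = 0; i < n; ++i) {
            dst[i] += gain * src[i];
        }
    }
#endif
}
```

Calling mix_into(bus, voice, 0.5f, frames) once per voice would sum several synth voices into one output buffer; whether that beats a hand-written loop depends on buffer size and is worth measuring.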

I think you have a slightly wrong idea of what the Neural Engine is. It has nothing in common with a CPU or GPU: it is just a machine learning accelerator with a very limited area of application, and not even every layer type in your model can run on it. Look at this FAQ for a better understanding of what it is and what it can do.

Q: Is there any low-level API to create my very own work-loads?

A: Yes and no. The low-level AppleNeuralEngine.framework is private to Apple, and you can't use it.

But take a look at ANE Tools, a compiler and decompiler for the Neural Engine. There is also coremltools, which helps you interface with TensorFlow and PyTorch.

Q: Can I use the Neural Engine to offload the CPU? I am especially interested in parallelism using threads.

A: Basically, no. The ANE can't execute CPU/GPU code and doesn't have threads. It operates on a layer connectivity map and network weights.

One more thing: vDSP and vecLib DON'T use the ANE.

UPDATE: The well-known reverse engineer Geohot has published some examples of using AppleH11ANEInterface.

You can also look at Wish Wu's presentation at Black Hat Asia 2021 about ANE internals. Sorry, Apple doesn't allow me to paste its link or attach the file here, so use Google.

Based on these two sources, you can conclude that the only thing the ANE does well is convolutions. Perhaps it was intentionally designed as a convolution accelerator. Other AI/ML operations, even ones as simple as ReLU, are 20 times slower than on the CPU.
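
To show how little work ReLU is on the CPU side, here is a rough sketch in C. The function name relu_inplace is hypothetical. On macOS it uses vDSP_vthr, which clamps every element from below to a threshold (link with -framework Accelerate); elsewhere it uses a plain loop. It does not measure or reproduce the ANE timing mentioned above.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__APPLE__)
#include <Accelerate/Accelerate.h> /* link with: -framework Accelerate */
#endif

/* Hypothetical helper (not part of Accelerate):
 * ReLU in place, i.e. x[i] = max(x[i], 0). */
void relu_inplace(float *x, size_t n)
{
    assert(x != NULL);
#if defined(__APPLE__)
    {
        float zero = 0.0f;
        /* vDSP_vthr: C[i] = A[i] if A[i] >= B, otherwise B. */
        vDSP_vthr(x, 1, &zero, x, 1, (vDSP_Length)n);
    }
#else
    {
        /* Portable fallback with the same result. */
        size_t i;
        for (i = 0; i < n; ++i) {
            if (x[i] < 0.0f) {
                x[i] = 0.0f;
            }
        }
    }
#endif
}
```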