When converting Caffe models to Core ML with coremltools, there is unfortunately no option to specify whether the resulting model should run in half (fp16) or full (fp32) precision.
I would like to explicitly make use of the fp16 ALUs on the GPUs of e.g. the A8/A9 during inference.
Is there a way to do this? Or does it happen automatically anyway, with the runtime optimizing ALU utilization for the underlying hardware?