Reply to Why i enabled Metal API in `encode` function but my Coreml custom layer still run on CPU
I have also had trouble getting the encodeToCommandBuffer function to be called. One thing I had to do was make sure the input was big enough. When I ran a custom layer on an image of shape (1, 8, 32, 32), for example, the CPU implementation was called. When I scaled that up to (1, 96, 256, 256), the GPU function was called. There seems to be some heuristic inside CoreML that looks at tensor size when deciding whether to run a custom layer on the CPU or the GPU. I'm not sure whether it looks at the size of the inputs, the outputs, or something else, but it's doing something like that. It likely depends on the device you're running on as well. A debug/verbose mode would be really nice for figuring out how CoreML arrives at these decisions.

I also noticed that the input tensor had to be somewhat 'image shaped'. When I passed (1, 500_000, 1, 1) through a custom layer, it was executed on the CPU. When I passed (1, 50, 100, 100) through (the same number of elements, 500,000), it was executed on the GPU. I guess this has to do with the encodeToCommandBuffer function accepting MTLTexture objects as inputs/outputs: maybe there is some limit on the number of channels, etc.
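A rough way to sanity-check the texture guess is to work out how each shape would map onto a texture. The sketch below is only arithmetic, not a Core ML API. It assumes (this is not confirmed anywhere in this thread) that channels are packed four per slice (RGBA) into a 2D array texture, and that Metal limits array textures to 2048 slices. `texture_layout` and both constants are hypothetical names used for illustration.

```python
import math

# Assumptions, not taken from the post: 4 channels are packed per texture
# slice (RGBA), and an array texture can hold at most 2048 slices.
CHANNELS_PER_SLICE = 4
ASSUMED_MAX_SLICES = 2048


def texture_layout(shape):
    """Describe how an (N, C, H, W) tensor would map onto an array texture."""
    n, c, h, w = shape
    slices = n * math.ceil(c / CHANNELS_PER_SLICE)
    return {
        "elements": n * c * h * w,   # total number of values in the tensor
        "slices": slices,            # texture slices needed for the channels
        "width": w,
        "height": h,
        "fits": slices <= ASSUMED_MAX_SLICES,
    }


if __name__ == "__main__":
    # The four shapes mentioned in the post.
    for shape in [(1, 8, 32, 32), (1, 96, 256, 256),
                  (1, 500_000, 1, 1), (1, 50, 100, 100)]:
        print(shape, texture_layout(shape))
```

Under these assumptions, (1, 500_000, 1, 1) would need 125,000 slices of a 1x1 texture, far over the assumed limit, while (1, 50, 100, 100) needs only 13. That would be consistent with the channel-limit guess. It does not explain the (1, 8, 32, 32) case, which fits comfortably and still ran on the CPU, so a separate size heuristic would still be needed.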
Jun ’23