Any demo for the Metal structs simdgroup_float8x8 and simdgroup_half8x8?

I have a Metal compute kernel for dense matrix multiplication, and I'd like to optimize it with simdgroup_float8x8 and simdgroup_half8x8.

However, it seems that no one is using them in Metal.

Can you give me some more demos of how to use them, other than the one in the Metal Shading Language Specification Version 2.4? Thanks!

Hi, have you tried MPSMatrixMultiplication? It should use these features when possible, and it supports both fp16 and fp32 precision.
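
If you still want to write the kernel by hand, here is a minimal sketch of how the simdgroup matrix functions might be combined for C = A * B. It is not taken from Apple sample code. The kernel name, buffer indices, and the N and K parameters are hypothetical, and it assumes row-major float matrices whose dimensions are multiples of 8, dispatched with one SIMD-group (32 threads) per threadgroup and a threadgroup grid of (N/8, M/8).

```metal
#include <metal_stdlib>
#include <metal_simdgroup_matrix>
using namespace metal;

// Hypothetical kernel: each SIMD-group computes one 8x8 tile of C.
// Assumes A is MxK, B is KxN, C is MxN, all row-major, with M, N, K
// multiples of 8 and exactly one SIMD-group per threadgroup.
kernel void matmul_simdgroup_8x8(device const float *A [[buffer(0)]],
                                 device const float *B [[buffer(1)]],
                                 device float       *C [[buffer(2)]],
                                 constant uint      &N [[buffer(3)]],
                                 constant uint      &K [[buffer(4)]],
                                 uint2 tile [[threadgroup_position_in_grid]])
{
    // Accumulator tile, initialized to zero.
    simdgroup_float8x8 acc = make_filled_simdgroup_matrix<float, 8, 8>(0.0f);
    simdgroup_float8x8 a;
    simdgroup_float8x8 b;

    const uint row = tile.y * 8;  // first row of this tile in C
    const uint col = tile.x * 8;  // first column of this tile in C

    // Walk along the K dimension in steps of 8.
    for (uint k = 0; k < K; k += 8) {
        // Load an 8x8 tile of A (row stride K) and of B (row stride N).
        simdgroup_load(a, A + row * K + k, K);
        simdgroup_load(b, B + k * N + col, N);
        // acc = a * b + acc
        simdgroup_multiply_accumulate(acc, a, b, acc);
    }

    // Write the finished tile back to C (row stride N).
    simdgroup_store(acc, C + row * N + col, N);
}
```

For half precision, the same structure should work with simdgroup_half8x8 and half buffers. A tuned kernel would normally compute several tiles per SIMD-group and stage data through threadgroup memory, which this sketch leaves out.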
