Why can't MSL cast float4x4 to float3x3?

The memory layout doesn't change in this sort of cast, and this is a common construct when transforming normals and tangents.

float3 normal = input.normal * (float3x3)skinTfm;

no matching conversion for functional-style cast from 'metal::float4x4' (aka 'matrix<float, 4, 4>') to 'metal::float3x3' (aka 'matrix<float, 3, 3>')

Since posts on these forums rarely get responses, I'll just post my workaround for now. It's less than ideal, since it constructs a whole new matrix when a cast should be fine, as it is in HLSL. Neither a cast nor direct construction works from a float4x4/half4x4, but this does:

inline float3x3 tofloat3x3(float4x4 m) {
    return float3x3(m[0].xyz, m[1].xyz, m[2].xyz);
}
inline half3x3 tohalf3x3(half4x4 m) {
    return half3x3(m[0].xyz, m[1].xyz, m[2].xyz);
}
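
For anyone landing here, a minimal sketch of how the helper might be used for the original normal transform. The names transformNormal and transformNormalW0 are hypothetical and just for illustration. The second variant is an alternative I'd expect to give the same xyz result, assuming the same row-vector (vector * matrix) convention as the original line: with w = 0, the fourth row and column of skinTfm don't affect xyz.

```metal
#include <metal_stdlib>
using namespace metal;

// Helper from the post above: keep the upper-left 3x3 block of the matrix.
inline float3x3 tofloat3x3(float4x4 m) {
    return float3x3(m[0].xyz, m[1].xyz, m[2].xyz);
}

// Hypothetical helper: the original line, with the failing cast
// replaced by a call to tofloat3x3().
inline float3 transformNormal(float3 normal, float4x4 skinTfm) {
    return normal * tofloat3x3(skinTfm);
}

// Hypothetical alternative: extend the normal with w = 0, multiply by the
// full 4x4, and drop w. Each xyz component is dot(float4(normal, 0), m[i]),
// which equals dot(normal, m[i].xyz), so it matches transformNormal().
inline float3 transformNormalW0(float3 normal, float4x4 skinTfm) {
    return (float4(normal, 0.0f) * skinTfm).xyz;
}
```

Either form avoids the cast. Which one reads better is mostly a matter of taste.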

Just wanted to share my observations about performance on Apple M1 (although I suspect it's similar on the A series).

I had done something similar and was surprised (yup, I'm a noob) to find that creating a whole new matrix doesn't impact performance at all.

I confirmed this with the official Xcode compiler statistics (with and without the float3x3 conversion).

Hurray scalar ALUs and optimizing compilers.
