Why can't MSL cast float4x4 to float3x3?

The memory layout doesn't change in this sort of cast, and this is a common construct when transforming normals and tangents.

float3 normal = input.normal * (float3x3)skinTfm;

no matching conversion for functional-style cast from 'metal::float4x4' (aka 'matrix<float, 4, 4>') to 'metal::float3x3' (aka 'matrix<float, 3, 3>')

Replies

Since there are never any responses to posts on the forums, I'll just post the solution for now. This is less than ideal, since it creates a whole new matrix when a cast should be fine, as it is in HLSL. Neither a cast nor construction works directly from a float4x4/half4x4, but this does:

inline float3x3 tofloat3x3(float4x4 m) {
    return float3x3(m[0].xyz, m[1].xyz, m[2].xyz);
}
inline half3x3 tohalf3x3(half4x4 m) {
    return half3x3(m[0].xyz, m[1].xyz, m[2].xyz);
}
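For anyone wondering how the helper slots into a shader, here is a minimal sketch of a vertex function that uses it. The struct names, buffer index, and the viewProj matrix are hypothetical and only there for illustration; only tofloat3x3 and skinTfm come from the posts above. It keeps the vector-times-matrix order used in the original question.

```metal
#include <metal_stdlib>
using namespace metal;

// Helper from the reply above: builds a float3x3 from the
// upper-left 3x3 block of a float4x4 (first three columns, xyz of each).
inline float3x3 tofloat3x3(float4x4 m) {
    return float3x3(m[0].xyz, m[1].xyz, m[2].xyz);
}

// Hypothetical vertex layout, for illustration only.
struct VertexIn {
    float3 position [[attribute(0)]];
    float3 normal   [[attribute(1)]];
};

struct VertexOut {
    float4 position [[position]];
    float3 normal;
};

// Hypothetical uniforms; skinTfm is the matrix from the question.
struct Uniforms {
    float4x4 skinTfm;
    float4x4 viewProj;
};

vertex VertexOut skinVertex(VertexIn in [[stage_in]],
                            constant Uniforms &u [[buffer(1)]]) {
    VertexOut out;

    // Positions use the full 4x4 transform.
    float4 skinned = float4(in.position, 1.0) * u.skinTfm;
    out.position = skinned * u.viewProj;

    // Normals only need the rotation/scale part, so take the 3x3 block
    // through the helper instead of the cast that fails to compile.
    out.normal = normalize(in.normal * tofloat3x3(u.skinTfm));
    return out;
}
```

The same pattern should apply to tangents, and to tohalf3x3 when working in half precision.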

Just wanted to share my observations about performance on Apple M1 (although I suspect the A series behaves similarly).

I had done something similar and was surprised (yup, I'm a noob) to find that creating a whole new matrix doesn't impact performance at all...

Confirmed with official Xcode compiler statistics (with and without float3x3 conversion)...

Hurray scalar ALUs and optimizing compilers.