Hello everyone.
We have recently started running our engine on iOS devices and I am looking into performance. I decided to look at some of the hints we get from the tools, and one of them is "Buffer Preloading Failed", which shows up on all our pipelines for every vertex buffer we use.
To give an example, we get the following message for one of our pipelines:
Buffer Preloading Failed
Make sure your data size is a multiple of 4 bytes and aligned to 4 bytes
and try using a simple access pattern. For constant buffers, try using a fixed buffer size.
vertexBuffer.0 could not be promoted - tmp.115.FLcq4C.metal:RenderSceneVS
I have already checked that the data in vertexBuffer.0 has a size that is a multiple of 4 bytes (it's 8 float3 entries). I also think it's aligned to 4 bytes - it's at offset 0 of its own MTLBuffer.
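For reference, the size and offset check described here can be written out as a few lines of plain C++. This is only a sketch: the helper name meetsFourByteRule and the constants are made up for illustration, and it assumes each position is either tightly packed (three 32-bit floats, 12 bytes) or padded to 16 bytes the way Metal's float3 type is. Which of the two applies depends on the vertex descriptor, which is not shown in this post.

```cpp
#include <cstddef>

// Hypothetical helper: true when both the byte offset into the buffer and
// the byte length of the data are multiples of 4, as the hint asks for.
constexpr bool meetsFourByteRule(std::size_t offsetBytes, std::size_t lengthBytes)
{
    return offsetBytes % 4 == 0 && lengthBytes % 4 == 0;
}

// Assumption: 8 vertices, as stated in the post.
constexpr std::size_t kVertexCount = 8;

// Assumption: tightly packed positions, three 32-bit floats per vertex.
constexpr std::size_t kPackedBytes = kVertexCount * 3 * sizeof(float);

// Assumption: positions padded to 16 bytes per vertex instead.
constexpr std::size_t kPaddedBytes = kVertexCount * 16;

// Both layouts pass the rule at offset 0 on platforms with 4-byte floats.
static_assert(meetsFourByteRule(0, kPackedBytes), "packed layout fails the 4-byte rule");
static_assert(meetsFourByteRule(0, kPaddedBytes), "padded layout fails the 4-byte rule");
```

Under either assumption the size and offset satisfy the rule quoted in the message, which suggests the failure is more likely related to the access pattern than to size or alignment.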
The access pattern should also be okay - we are using vertex descriptors and this is the only vertex buffer bound (it uses buffer slot 16). Apart from that, there are two constant buffers bound at slots 0 and 4. The buffer at slot 9 is not used and thus not bound for this call.
The compilation target is iOS 11.4 and the metal version requested is 2.0.
The shader itself looks like this (minus the unused functions I removed):
#include <metal_stdlib>
#include <metal_geometric>
using namespace metal;
constexpr static const constant int kLightCountMax = 3;
struct ViewConstantBuffer
{
    float4x4 g_mViewProjection;
    float4 g_CameraPosition;
    float4x4 g_primaryCameraMatrix;
    float4 g_mSoftwareViewport;
    float g_alphaFlag;
};
struct ObjectConstantBuffer
{
    float4x4 g_mWorld;
};
struct LightingConstantBuffer
{
    float4x4 g_shadowViewProjection[kLightCountMax];
    float4 g_lLightPositionRadius[kLightCountMax];
    float4 g_lLightColorType[kLightCountMax];
    float4 g_shadowMapControls[kLightCountMax];
    float4 g_shadowMapControls2[kLightCountMax];
};
struct VS_INPUT
{
    float4 Position [[attribute(0)]];
};
struct VS_OUTPUT
{
    float4 Position [[position]];
    float3 UVW;
};
vertex VS_OUTPUT RenderSceneVS(VS_INPUT vsInput [[stage_in]],
                               constant ObjectConstantBuffer& object [[buffer(0)]],
                               constant ViewConstantBuffer& view [[buffer(4)]],
                               constant LightingConstantBuffer& lighting [[buffer(9)]] )
{
    VS_OUTPUT vsOutput;
    float4 modelPos = vsInput.Position;
    float4 worldPos = ((modelPos) * (object.g_mWorld));
    float4 worldViewProjPos = ((worldPos) * (view.g_mViewProjection));
    vsOutput.UVW = modelPos.xyz;
    vsOutput.Position = worldViewProjPos;
    return vsOutput;
}
Does anyone know why the Metal runtime would not be able to preload the vertex buffer? Or maybe someone is aware of some tools I can use to debug this issue (I already have the API validation turned up to Extended)?