I am rendering 2.7 million vertices. When I don't use transparency for my vertices (i.e. opacity is 1.0) and alpha blending remains enabled, my fragment function does not take much time and the FPS is 60. As soon as I give each vertex an opacity of 0.5, the fragment function becomes very slow and my FPS drops to 23. Can anyone give me a hint as to what the issue could be? I profiled the app using Xcode's GPU debugger and Xcode does not show any performance issue.
I am attaching two screenshots of the performance timeline and performance counters taken from Xcode's Metal debugger. As we can see, the render command encoder takes 45.16 ms in Screenshot 1, and that is because of the fragment function, which takes the same 45.15 ms, as shown in Screenshot 2.
Please see the attached screenshots and let me know if anyone sees an issue with any performance counter, and where I should investigate.
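For reference (not from the original post): alpha blending of this kind is usually enabled on the render pipeline descriptor roughly as below; the descriptor variable and the choice of blend factors are my assumptions, shown only so the setup being discussed is concrete.

    MTLRenderPipelineDescriptor *desc = [MTLRenderPipelineDescriptor new];
    // Classic source-over blending: dst = src.a * src + (1 - src.a) * dst
    desc.colorAttachments[0].blendingEnabled = YES;
    desc.colorAttachments[0].sourceRGBBlendFactor = MTLBlendFactorSourceAlpha;
    desc.colorAttachments[0].destinationRGBBlendFactor = MTLBlendFactorOneMinusSourceAlpha;
    desc.colorAttachments[0].sourceAlphaBlendFactor = MTLBlendFactorSourceAlpha;
    desc.colorAttachments[0].destinationAlphaBlendFactor = MTLBlendFactorOneMinusSourceAlpha;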
I have a MTLBuffer bound to my fragment shader. The data in this MTLBuffer is a struct that contains a count variable and an array of another struct, as given below:
    struct Data { /* has some vector variables */ };

    struct Lights
    {
        int  light_count;
        Data light_data[10];
    };
I populate only one light's data in the MTLBuffer, so the buffer is not fully populated. In this case the fragment function becomes very slow; if I change the array size of light_data to 1, the fragment shader becomes fast again.
Another thing: if I fully populate the MTLBuffer and access light_data in a for loop in my fragment function, it again becomes very slow.
Does anybody have an idea why this happens? Is there a solution to make the fragment function fast? Or should I lay out my Lights struct differently so that the fragment function can access it faster?
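Not an answer from the thread, just a minimal MSL sketch of one layout/access pattern worth trying: keep the Lights struct in the constant address space and bound the loop by light_count so only the populated entries are read. The fields of Data and the lighting math here are illustrative assumptions, not the poster's actual code.

    #include <metal_stdlib>
    using namespace metal;

    struct Data {
        float4 position;   // assumed "vector variables"
        float4 color;
    };

    struct Lights {
        int  light_count;
        Data light_data[10];
    };

    struct FragmentIn {
        float4 position [[position]];
        float3 normal;
    };

    fragment float4 lit_fragment(FragmentIn in [[stage_in]],
                                 constant Lights &lights [[buffer(0)]])
    {
        half3 accum = half3(0.0h);
        // Loop only over the entries actually populated, and keep the
        // accumulator in half to reduce register pressure.
        for (int i = 0; i < lights.light_count; ++i) {
            half ndotl = max(0.0h, half(dot(in.normal,
                                            lights.light_data[i].position.xyz)));
            accum += half3(lights.light_data[i].color.rgb) * ndotl;
        }
        return float4(float3(accum), 1.0);
    }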
Metal shaders use two 16-bit GPU registers for one 32-bit float variable, because A8 and later GPUs have only 16-bit registers. So if we use many variables in our Metal fragment shaders, we exceed the register limit and data spills into slower GPU memory, which causes a very low FPS.
But why does this not happen with OpenGL shaders, even though they use many 32-bit float variables just like the Metal shaders? Any idea what trick OpenGL uses, given that the GPU is the same in both cases, Metal rendering and OpenGL rendering?
The Metal Shading Language has a half data type, and it is recommended to use half for better shader performance. But we populate our MTLBuffer from the CPU side, and the CPU does not natively support half-precision floats; in that case, how can we store our CPU float data as half in the MTLBuffer?
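One way, sketched under the assumption of clang on Apple platforms: clang's storage-only __fp16 type converts from float on assignment, so the CPU can write halves straight into a shared MTLBuffer. CopyFloatsAsHalf is a hypothetical helper name, not an API.

    #import <Metal/Metal.h>

    // Narrow 32-bit floats to 16-bit halves directly into a MTLBuffer.
    // Assumes the buffer uses storageModeShared and its length is at
    // least count * sizeof(__fp16).
    static void CopyFloatsAsHalf(id<MTLBuffer> buffer, const float *src, size_t count)
    {
        __fp16 *dst = (__fp16 *)buffer.contents;
        for (size_t i = 0; i < count; ++i) {
            dst[i] = (__fp16)src[i];   // float -> half narrowing conversion
        }
    }

For large arrays, Accelerate's vImageConvert_PlanarFtoPlanar16F performs the same float-to-half conversion in bulk.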
I am rendering around 12K vertices. I have some fragment functions that are lengthy, around 500 lines. These shaders do lighting calculations, which is why they access constant buffers. The problem is that these shaders take 50-70 ms just to render 12K vertices.
When I profiled the app, Xcode reported 4 KiB of memory spilling in the fragment function, and the fragment function spends 60% of its time waiting on memory reads from buffers. I can't optimise my fragment shaders any further, and I am surprised that Metal is this slow rendering so few vertices with a lengthy fragment function. What could be the problem? Is there an issue with how I bind my buffers to the fragment shader? Why is the Metal fragment function so slow compared to OpenGL?
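Not a diagnosis, but one binding pattern worth checking, shown as a sketch: for small constant data (Apple suggests under about 4 KB), setFragmentBytes: hands the data to the encoder directly instead of going through a separate MTLBuffer. The LightUniforms struct and its fields are illustrative, and renderCommandEncoder is assumed to be the encoder from the posts above.

    typedef struct {
        int light_count;
        // ... per-light fields, illustrative only
    } LightUniforms;

    LightUniforms uniforms;
    uniforms.light_count = 1;
    [renderCommandEncoder setFragmentBytes:&uniforms
                                    length:sizeof(uniforms)
                                   atIndex:0];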
Hi, I have 7 rects with 7 different textures. I use a single render command encoder and bind each rect's texture to index 0 of my fragment function by calling setFragmentTexture:, as given below:
    [renderCommandEncoder setFragmentTexture:rects[i].mtltexture atIndex:0];
Then what happens is that rects[0] and rects[1] get rendered with rects[0]'s texture, rects[2] and rects[3] get rendered with rects[1]'s texture, and so on. Each rect should be rendered with its own texture, because I set it correctly every time in the loop, but that is not what happens.
I am not able to understand why setFragmentTexture: is not binding the correct texture. Please help me work out what the issue could be.
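For comparison, a minimal sketch of the bind-then-draw pattern such a loop is expected to follow; each setFragmentTexture: call only affects the draw calls encoded after it, so every rect needs its own draw before the next bind. rects and mtltexture are the poster's names; vertexBuffer and the draw parameters are my assumptions.

    for (NSUInteger i = 0; i < 7; ++i) {
        [renderCommandEncoder setFragmentTexture:rects[i].mtltexture atIndex:0];
        [renderCommandEncoder setVertexBuffer:rects[i].vertexBuffer
                                       offset:0
                                      atIndex:0];
        // Encode this rect's draw immediately, while its texture is bound.
        [renderCommandEncoder drawPrimitives:MTLPrimitiveTypeTriangleStrip
                                 vertexStart:0
                                 vertexCount:4];
    }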