Are device.currentAllocatedSize and GPU capture buffer memory accurate on iOS/iPad?

I see reasonable numbers from this on macOS, but on iPad I see really large numbers, both from this and in the GPU capture, that don't add up. This is Xcode 12.2 and an iPad on 14.0.1.

Textures and Buffers add up to 261MB, which is close to the macOS total. The memory summary and the "other" area in the buffers area report 573MB when I hover over them. Also, device.currentAllocatedSize reports 868MB total. I assume the buffer size is skewing the memory totals, since Xcode reports 620MB for the entire app.

I would attach a screenshot of the gpu capture showing the memory capture, but seems that the new forums don't support this, and not being able to search categories anymore is rather limiting.

Non-volatile 261
Volatile 0

Textures 195
Buffers 66 <- but hover over "other" reports 573

Private 184
Shared 77

Used 166
Unused 95

When you read device.currentAllocatedSize, were you performing a capture with the frame debugger? The frame debugger allocates a considerable amount of memory to track resources. While Xcode will not show this memory, since it tries to reflect the state of execution under normal circumstances, this memory may show up in device.currentAllocatedSize.

We're not entirely sure why you're seeing 573MB when you've only allocated 261MB of resources. There is typically some memory allocated by the driver in the "other" category for bookkeeping and saving of transformed vertex data for the deferred fragment stage. However, what you're seeing is much more than we'd expect.

If you create a request with Feedback Assistant and include a test and/or other executable, we can look at where this is coming from (and hopefully provide more accurate information in Xcode in the future). The more data you provide, the better. Please repost the FB number here.

In both cases, the capture was enabled on macOS and iOS. But only iOS listed a 500MB increase in the Memory Viewer of GPU capture when the capture was performed. The macOS numbers were reasonable, and what I expected. When I disabled GPU capture, the memory difference reported by Xcode's memory panel (not GPU capture) was only 10MB or so.

We have our own reporting system, so I'll dump the numbers with and without capture enabled below. I was seeing numbers similar to this when vertex buffers weren't de-duped before being added up. That happens when you have sub-ranges of vertices/indices all packed into a single VB/IB.
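
For anyone following along, the de-duping mentioned above can be sketched roughly like this. It is only an illustration, not the reporting tool from this post: `FakeBuffer`, `MeshRange`, `naiveTotal`, and `dedupedTotal` are hypothetical names, and `FakeBuffer` stands in for an MTLBuffer so the snippet runs without Metal.

```swift
// Hypothetical stand-in for a GPU buffer object; in a Metal app this
// would be an MTLBuffer, which is also a reference type.
final class FakeBuffer {
    let allocatedSize: Int
    init(allocatedSize: Int) { self.allocatedSize = allocatedSize }
}

// A mesh that uses a sub-range of a (possibly shared) vertex or index buffer.
struct MeshRange {
    let buffer: FakeBuffer
    let offset: Int
    let length: Int
}

// Adds the buffer size once per range, so shared buffers are over-counted.
func naiveTotal(_ ranges: [MeshRange]) -> Int {
    return ranges.reduce(0) { $0 + $1.buffer.allocatedSize }
}

// Adds each distinct buffer object once, keyed by object identity.
func dedupedTotal(_ ranges: [MeshRange]) -> Int {
    var seen = Set<ObjectIdentifier>()
    var total = 0
    for meshRange in ranges {
        if seen.insert(ObjectIdentifier(meshRange.buffer)).inserted {
            total += meshRange.buffer.allocatedSize
        }
    }
    return total
}

// Three sub-ranges packed into one 1 MiB buffer (made-up sizes).
let shared = FakeBuffer(allocatedSize: 1_048_576)
let ranges = [
    MeshRange(buffer: shared, offset: 0, length: 4096),
    MeshRange(buffer: shared, offset: 4096, length: 8192),
    MeshRange(buffer: shared, offset: 12_288, length: 2048),
]
print("naive:", naiveTotal(ranges), "de-duped:", dedupedTotal(ranges))
```

With the made-up data above, the naive sum counts the shared buffer three times, while the de-duped sum counts it once.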

With and without capture enabled, the amount reported is the same by our tool that accumulated the de-duped data:
Total device size: 865 MB
Total vertex size: 588 MB
Total texture size: 198 MB



Sorry, I'm unclear. Is there still an issue now that you're seeing that device.currentAllocatedSize is close to what Xcode reports? Are you still seeing 573MB of "other" which you can't account for vs the 66MB you expect?

It looks like all of our iOS buffers are allocated at a 128K minimum, even small uniform buffers (as reported by MTLBuffer.allocatedSize). device.currentAllocatedSize also reflects this larger total. But on macOS with the same Metal code, the buffers are 4K or less. This made it hard to track down, since it was 4300 small MTLBuffers x 128K = 530MB. Is there some large page size on iPadOS that is killing us here that isn't the case on macOS (Intel)?

So page sizes are 4x what we see on macOS. That does waste memory. It’s 16K, not 128K though. But still if a MTLResource is mapped to a page minimum, then this does waste more space on iOS.

iPadOS
expr (int)getpagesize()
(int) $1 = 16384

macOS
expr (int)getpagesize()
(int) $0 = 4096
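
To put rough numbers on the granularity question, here is a small sketch of the arithmetic. It assumes every buffer occupies a whole number of fixed-size blocks, which is a simplification and not a statement of how Metal actually allocates. The 4300 count and the 4K/16K/128K granularities come from this thread; the 256-byte buffer length and the function names are made up for illustration.

```swift
// Round a requested size up to a multiple of an allocation granularity.
func roundedUp(_ size: Int, toMultipleOf granularity: Int) -> Int {
    precondition(size >= 0 && granularity > 0)
    return (size + granularity - 1) / granularity * granularity
}

// Total footprint of `count` buffers of `length` bytes each, assuming
// every buffer occupies a whole number of `granularity`-sized blocks.
func footprint(count: Int, length: Int, granularity: Int) -> Int {
    return count * roundedUp(length, toMultipleOf: granularity)
}

let bufferCount = 4300    // buffer count quoted in the thread
let uniformLength = 256   // hypothetical small uniform buffer, in bytes

// 4K (macOS page), 16K (iPadOS page), 128K (reported allocatedSize).
for granularity in [4096, 16_384, 128 * 1024] {
    let bytes = footprint(count: bufferCount, length: uniformLength,
                          granularity: granularity)
    let mib = Double(bytes) / (1024.0 * 1024.0)
    print("granularity \(granularity) bytes -> \(mib) MiB")
}
```

At 128 KiB per buffer, 4300 buffers come to 537.5 MiB, which is in the same range as the ~530MB figure quoted earlier; at a 16 KiB page the same buffers would come to about 67 MiB.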

Also, this is strange: some of our buffers report less than 128K allocated, but the super small ones always return 128K. It's like Metal is sticking super tiny allocations into a shared buffer, and then we (and GPU capture) are double-counting the memory.

VertexData_1 request:0.046 actual:0.047 MB
VertexData_0 request:0.000 actual:0.125 MB
VertexData_2 request:0.000 actual:0.125 MB



Usually many MTLBuffer objects are allocated within a master buffer which is 128KB aligned. The size in MTLBuffer.allocatedSize is the size of this master buffer.

We just figured out that this is actually an accounting issue where we're not calculating the correct currentAllocatedSize and Xcode is also not showing the correct size for buffers allocated. We're looking at a fix.

Is your app getting a memory warning from the system or being killed due to memory pressure? What are you using this number for?

We are using this number to sanity check our buffer and texture usage against the internal numbers that we have. For example, on macOS, non-pow2 3D textures were 8x bigger than their dimensions would suggest (a 384x384x3 texture). The texture correspondence looked reasonable on macOS and iOS (maybe 200K off from our totals).

The buffers on iOS were way off (totaling up de-duped allocatedSize), but then device.currentAllocatedSize also reflected this much larger size that indicated that it was totaling up these larger values.

We aren't hitting memory limits on this test, since my device has 4GB and we're around 0.8GB. In the past, though, we have had small buffer allocations (one quad per buffer) report 1GB of buffer memory use, so we ended up optimizing it to consolidate those.

Also, GPU capture is reporting incorrect memory totals as a result of this, so validation has to be done on macOS. And macOS reports 220MB less than iOS even in the Xcode memory totals. Before we try to consolidate to fewer MTLBuffers, having allocatedSize/currentAllocatedSize be correct on iOS would really help.




Sorry, it doesn't look like the numbers we're giving you are actually sane.

Can you create a request via Feedback Assistant, providing any detail (and perhaps a case to reproduce this issue)? We'd like to have something to verify against once we have a fix for this accounting issue you're hitting.

Here's the Feedback number.

FB8934899 (MTLBuffer.allocatedSize and MTLDevice.currentAllocatedSize return 128K master buffer size)

My workaround for now is to see if allocatedSize returns 128*1024, and if so then use buffer.length. This gives reasonable numbers even if they're not exactly what the system allocated. Also we don't use the device.currentAllocatedSize since it's unreliable with these buffers overcounted.
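
In case it helps anyone else, the workaround above might look something like the sketch below. The function name `estimatedSize` and the constant `masterBufferSize` are hypothetical, and the function takes plain integers so it runs without Metal; this is an approximation for reporting, not the size the system actually allocated.

```swift
// Size reported for small buffers that live inside a 128K master buffer,
// per the discussion in this thread.
let masterBufferSize = 128 * 1024

// Returns the size to count for one buffer in our own memory report.
// If allocatedSize looks like the shared master buffer size, fall back
// to the requested length; otherwise trust allocatedSize.
func estimatedSize(length: Int, allocatedSize: Int) -> Int {
    return allocatedSize == masterBufferSize ? length : allocatedSize
}

// Made-up sample values: a tiny uniform buffer and a larger vertex buffer.
print(estimatedSize(length: 64, allocatedSize: 131_072))
print(estimatedSize(length: 48_000, allocatedSize: 49_152))
```

With a real buffer this would be called as `estimatedSize(length: buffer.length, allocatedSize: buffer.allocatedSize)`. A buffer that genuinely needs exactly 128K is indistinguishable from a sub-allocated one here, but in that case the two values agree closely anyway.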

Was there ever a solution here? We are working on a game that reports up to 400 MB more than expected on iOS, and we seem to be seeing some very similar issues to what you were seeing.

@AlecazamTGC, I would really love your help, or an Apple developer's if possible. This seems to be causing our game to crash way before it normally would, due to increased memory pressure that I am not sure reflects actual memory being used. It is very odd. I've been investigating this same issue on and off for a few months now, and finding this thread made me realize I wasn't going crazy with the numbers I was seeing. It would be fine if it was just a reported difference, but it appears to actually affect when the game gets killed by the OS, so that's a big problem.

It's a big problem that greatly increases the probability of our app crashing.

These are the statistics from our game: Buffers 53 <- but hover over "other" reports 480

It looks like the same cause as in AlecazamTGC's feedback: a large number of our buffers report an allocatedSize of 128K.

I wonder if there is a way to reduce the memory allocation overhead before Apple fixes it?
