I finally got this to work by making the Storyboard view an NSView (instead of an NSImageView), and then adding the NSImageView to it with addSubview. But now I need the subview to auto-layout to the parent's bounds. That's at least a fixable problem.
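For the auto-layout part, this is roughly what I'm doing; a minimal sketch, where imageView and parentView are my placeholder names:

NSImageView *imageView = [NSImageView new];
// opt out of autoresizing masks so the constraints below take effect
imageView.translatesAutoresizingMaskIntoConstraints = NO;
[parentView addSubview:imageView];
// pin all four edges of the subview to the parent's bounds
[NSLayoutConstraint activateConstraints:@[
    [imageView.leadingAnchor constraintEqualToAnchor:parentView.leadingAnchor],
    [imageView.trailingAnchor constraintEqualToAnchor:parentView.trailingAnchor],
    [imageView.topAnchor constraintEqualToAnchor:parentView.topAnchor],
    [imageView.bottomAnchor constraintEqualToAnchor:parentView.bottomAnchor],
]];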
The proposed solution of explicitly setting x86_64 arm64 on macOS targets doesn't work for me either with Xcode 13.1. My .a library files and frameworks all report that they are missing the arm64 architecture when building for "Any Mac". This is because the scheme is set to "Debug", which enables "Build Active Architecture Only" (ONLY_ACTIVE_ARCH), and in my case the active arch is x86_64.
That Debug setting is not overridden by the universal-app setting when "Any Mac" is selected. Switching the build configuration to "Release" avoids all the crazy errors with "Any Mac". But then there's a crazy error from my use of modules. mm_malloc.h must not have an equivalent on macOS arm64, or the module isn't built properly for the arm64 build. This was originally for SSE intrinsics, so I'll look to replace it with something universal.
#include <mm_malloc.h>
Module '_Builtin_intrinsics.intel.mm_malloc' requires feature 'x86'
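A universal replacement is plain aligned allocation; a minimal sketch, assuming the only use of mm_malloc.h was _mm_malloc/_mm_free:

#include <stdlib.h>

// Universal stand-in for _mm_malloc: posix_memalign exists on both
// x86_64 and arm64 macOS. alignment must be a power of two and a
// multiple of sizeof(void *).
static inline void *aligned_malloc(size_t size, size_t alignment) {
    void *ptr = NULL;
    if (posix_memalign(&ptr, alignment, size) != 0)
        return NULL;
    return ptr;
}
// _mm_free call sites just pair with free() here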
Also, isn't CADisplayLink supposed to synchronize the frames? And we have a semaphore that counts down from the drawable count before we request nextDrawable, which also seems redundant. The problem is that the program thread wants to get on to the next frame, but can't, due to the requirement to call nextDrawable. Even Apple's docs state this should be called as late in the frame as possible, but when that call takes 24ms, it feels like it's doing the job of the display link.
This supplied best-practices example isn't performant. It incurs all the stalls of using a single command buffer.
https://developer.apple.com/library/archive/documentation/3DDrawing/Conceptual/MTLBestPracticesGuide/Drawables.html
Here's a more realistic example of what happens using a single command buffer.
beginCommandBuffer
beginEncoding
98% of commands to offscreen
endEncoding
<- this is where I currently end/commit the 98% command buffer to start driver processing
20ms+ stall on nextDrawable under heavy gpu load
beginEncoding
2% of commands to drawable (say a blit from offscreen to drawable)
endEncoding
1ms+ stall on presentDrawable (some stall then drawable.present added to addScheduledHandler)
endCommandBuffer
[cb commit] <- this is where commands are sent to queue and the driver
Ideally, nextDrawable and presentDrawable would be off on their own thread on a little core, so the main thread on a big core isn't stalled out.
The case we have is 90 alpha-blended quads on iOS that cause 11ms of GPU time plus the rest of the rendering. That stalls nextDrawable returning drawables to the pool, and with a single command buffer it stalls processing the next frame and getting to the next CPU update.
There is also still no test for isDrawableAvailable in the pool.
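What I'm moving toward is splitting the frame into two command buffers, so the bulk of the work is already submitted before the nextDrawable stall; a sketch, with queue and metalLayer as assumed names:

// Commit the offscreen 98% first so the driver starts processing.
id<MTLCommandBuffer> offscreenCB = [queue commandBuffer];
// ... encode the offscreen passes ...
[offscreenCB commit];

// Only now block on the drawable for the final 2% (the blit).
id<CAMetalDrawable> drawable = [metalLayer nextDrawable]; // the 20ms+ stall
id<MTLCommandBuffer> presentCB = [queue commandBuffer];
// ... encode the blit from offscreen to drawable.texture ...
[presentCB presentDrawable:drawable];
[presentCB commit];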
Those functions don't account for the sub-allocation strategy used for small allocations within 128K buffers. So GPU capture tends to just report 128K for the sizes, regardless of whether several buffers share the same 128K allocation. The tool needs to dedupe those allocations. I filed a forum post and a Feedback Assistant report on this, and Apple was looking to fix the issue.
This is happening even with Monterey macOS 12.0.1 on a 16" MBP. So this is a wide-ranging problem. Only Safari works for reading forum messages.
I'm trying to read glTF files, which consist of JSON, a bin file, and a series of PNG images, all in the same folder. How do I ship a viewer for this if the viewer must be sandboxed, but the sandboxed app can only read the .gltf file that was supplied? "Related Items" seems to require the same name with different extensions, but here the image names can be anything.
Apple's omission of glTF reader support in ModelIO requires me to supply my own reader, and I can't ask users to run usdzconvert first. But usdz has a similar file structure with separate images. All for sandboxing, except when it doesn't work. Also, trying to submit this question in Chrome results in "Bad Message 431 reason: Request Header Fields Too Large", so I had to switch to Safari.
Basically, do users need to pick the folder containing the glTF, instead of the glTF file itself? Then the sandbox grants access to that folder and I can read-only any contents therein. That's a rather unintuitive open-panel flow for the user. I've shipped popular tools that had to be pulled from the App Store due to sandboxing.
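For reference, the open-panel setup that grants folder-wide read access looks something like this (a sketch):

// Have the user pick the folder so the sandbox grants read access
// to the .gltf plus its .bin and .png siblings.
NSOpenPanel *panel = [NSOpenPanel openPanel];
panel.canChooseDirectories = YES;
panel.canChooseFiles = NO;
panel.allowsMultipleSelection = NO;
panel.message = @"Choose the folder containing the glTF file";
if ([panel runModal] == NSModalResponseOK) {
    NSURL *folderURL = panel.URL;
    // contents of folderURL are readable for this run; store a
    // security-scoped bookmark to keep access across launches
}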
DrawIndirect doesn't work well on iOS or macOS, since you can only submit one draw at a time referencing a buffer and offset. There's also no stride to store additional instance data, and no drawIndirectCount like in Vulkan, where the GPU supplies the count of things to draw. So it's not really saving much over making the draw calls themselves.
If you can target A9, which is where DrawIndirect started, then look into MTLIndirectCommandBuffer, which can supply a range of draw calls and is the only way to submit a batch of commands as one submission.
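Roughly, the setup looks like this; a sketch from memory, with maxDraws and the bind counts as placeholders:

// Batch a range of draws as one submission with an indirect command buffer.
MTLIndirectCommandBufferDescriptor *icbDesc = [MTLIndirectCommandBufferDescriptor new];
icbDesc.commandTypes = MTLIndirectCommandTypeDrawIndexed;
icbDesc.inheritPipelineState = YES; // reuse the encoder's pipeline
icbDesc.maxVertexBufferBindCount = 2; // placeholder
icbDesc.maxFragmentBufferBindCount = 0; // placeholder

id<MTLIndirectCommandBuffer> icb =
    [device newIndirectCommandBufferWithDescriptor:icbDesc
                                   maxCommandCount:maxDraws
                                           options:0];

// After the CPU (or a compute kernel) fills in the draws:
[renderEncoder executeCommandsInBuffer:icb withRange:NSMakeRange(0, maxDraws)];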
That's not exactly going to work. The terrain system uses R16Unorm for the precision, and it's the only format that maps to PNG, which only stores 8-bit and 16-bit unsigned values. Photoshop has been using 16u to store color for a long time, so I'm a little shocked that it's not exposed on desktop, even if Apple Silicon can't handle it.
R16Float is only 10 bits of precision, and R16Uint doesn't support filtering. But worst case, the PNG data could be read into R16Uint. The Apple sample code should reflect valid use cases, though.
If this is suddenly a per-format query, then Metal needs a call to query whether read-write support is available for each format. Those docs are helpful, but a runtime query is needed. Otherwise, how would an app test that R16Uint is supported and not R16Unorm?
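The closest thing available today is the device-wide tier query, which doesn't answer the per-format question; a sketch:

// The only runtime query is the device-wide tier, not per-format.
id<MTLDevice> device = MTLCreateSystemDefaultDevice();
if (device.readWriteTextureSupport == MTLReadWriteTextureTier2) {
    // Tier2 documents additional read-write formats, but there is
    // still no API to confirm R16Unorm support specifically.
}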
Seems to be a bad bug in the Metal validation layer. It flags this RG16Unorm texture as unsupported, but it is supported. Turning off Metal validation fixes the sample app for me.
This is happening in Apple's own terrain sample code on macOS, on the latest 16" Intel MBP with an AMD 5500M. The Metal texture loader loads an L16 PNG into an RG16Unorm texture, since it provides no control over the MTLPixelFormat. Then when you click to modify the terrain with the mouse, the app crashes in validation.
The texture reports All usage, function texture read-write is true, and readWriteTextureSupport is Tier2. If Apple can't write a correct example that works, then we probably can't either.
MTLTextureDescriptor *texDesc =
    [MTLTextureDescriptor texture2DDescriptorWithPixelFormat:MTLPixelFormatRG16Unorm // should be MTLPixelFormatR16Unorm
                                                       width:heightMapWidth
                                                      height:heightMapHeight
                                                   mipmapped:NO];
2021-08-18 22:43:41.385581-0700 DynamicTerrainWithArgumentBuffers[46069:3553941] sample running on: AMD Radeon Pro 5500M
validateComputeFunctionArguments:854: failed assertion `Compute Function(TerrainKnl_UpdateHeightmap): Shader uses texture(heightMap[0]) as read-write, but hardware does not support read-write texture of this pixel format.'
There's still the issue on iOS of where development builds can reload a new metallib from for hot-reloading. I'm assuming we have some access to a Downloads or public folder the metallib can be copied into.
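Once the metallib is somewhere readable, reloading it is straightforward; a sketch, with the Documents path as a placeholder for wherever the file lands:

// Hot-reload a metallib from a writable location in a dev build.
NSString *docs = NSSearchPathForDirectoriesInDomains(
    NSDocumentDirectory, NSUserDomainMask, YES).firstObject;
NSURL *libURL = [NSURL fileURLWithPath:
    [docs stringByAppendingPathComponent:@"shaders.metallib"]];
NSError *error = nil;
id<MTLLibrary> library = [device newLibraryWithURL:libURL error:&error];
if (!library) {
    NSLog(@"metallib reload failed: %@", error);
}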
If you're dealing with GPU textures, then you need to copy them to a staging texture or buffer using the blit encoder. Then the bytes are available to the CPU. But the GPU runs 1-3 frames ahead of the CPU, so you can't expect to read the data back immediately.
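A sketch of that staging copy, using a completed handler instead of blocking; texture, device, and queue are assumed names, and 4 bytes per pixel is an assumption:

// Copy a GPU texture into a shared staging buffer for CPU readback.
NSUInteger bytesPerRow = texture.width * 4; // assumes a 4-byte format
id<MTLBuffer> staging =
    [device newBufferWithLength:bytesPerRow * texture.height
                        options:MTLResourceStorageModeShared];
id<MTLCommandBuffer> cb = [queue commandBuffer];
id<MTLBlitCommandEncoder> blit = [cb blitCommandEncoder];
[blit copyFromTexture:texture
          sourceSlice:0
          sourceLevel:0
         sourceOrigin:MTLOriginMake(0, 0, 0)
           sourceSize:MTLSizeMake(texture.width, texture.height, 1)
             toBuffer:staging
    destinationOffset:0
destinationBytesPerRow:bytesPerRow
destinationBytesPerImage:bytesPerRow * texture.height];
[blit endEncoding];
[cb addCompletedHandler:^(id<MTLCommandBuffer> done) {
    // staging.contents is now safe to read; since the GPU runs
    // 1-3 frames ahead, this fires a few frames after commit
}];
[cb commit];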
Apple keeps shipping products like the new Apple TV with an A12 chip, so if that's in your market then you'll need a fallback. Those are Tier 1 devices that cannot index into, or hold pointers to, argument buffers, which is a key part of GPU-driven pipelines and ray tracing. Fortunately, the M1 on macOS and the A14 are replacing these old chips.
So there is no way to emulate multi-draw indirect count?
There is. But you have to call drawPrimitives or drawIndexedPrimitives multiple times, each one indexing to the next indirect draw arguments in the buffer. I don't know why Metal left a drawCount out of the API, but the current implementation effectively has a draw limit of 1 per call. The nice thing is indirect draw works back to the iPhone 5S.
You can even fill the indirect buffers with compute on the GPU, and then indirect-draw them. But you have to call draw for the maximum count, say 10 times, even if the compute cull only produces 5 results, so make sure the remaining draw arguments have their instance count set to 0. Or if you can wait a frame, you could return a count to the CPU. A sketch of the loop follows.
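Here encoder, indexBuffer, argsBuffer, and maxDraws are assumed names, stepping an indirect draw through a buffer of MTLDrawIndexedPrimitivesIndirectArguments:

// Emulate multi-draw indirect: one indirect draw per argument slot.
// A compute cull writes instanceCount = 0 into slots it rejects.
for (NSUInteger i = 0; i < maxDraws; ++i) {
    [encoder drawIndexedPrimitives:MTLPrimitiveTypeTriangle
                         indexType:MTLIndexTypeUInt32
                       indexBuffer:indexBuffer
                 indexBufferOffset:0
                    indirectBuffer:argsBuffer
              indirectBufferOffset:i * sizeof(MTLDrawIndexedPrimitivesIndirectArguments)];
}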