Render advanced 3D graphics and perform data-parallel computations using graphics processors using Metal.

Posts under Metal tag

200 Posts
Sort by:

Post

Replies

Boosts

Views

Activity

realitytool requires Metal for this operation and it is not available in this build environment
Hello, I'm getting started for my project with Xcode Cloud since I upgraded to the macOS Sequioa Beta and Xcode 16 now refuses to archive builds for TestFlight. Somewhere very late in the build process I get the following error: realitytool requires Metal for this operation and it is not available in this build environment The log says this happens at: Compile Skybox urban.skybox My project uses RealityKit. How can I fix this issue? Thanks!
2
1
391
Sep ’24
Video Background Removal
I am searching for a method to remove background from a video. it can be from camera Session fileOutput url or from photo library. I was able to accomplish live preview of removed background with the depth data and some metal framework code from the example Enhancing Live Video by Leveraging TrueDepth Camera Data. However I count figure out a way to save this as a video so that I can upload it. Also this method is using over 150% of cpu ( Xcode cpu usage ), which seems to be quite a lot and the device is getting heated up so fast and drops the frames when It hot. I also found something similar from GitHub using CoreML example by Dmitry Voitekh which only uses less than 40% cpu. Any information regarding this will be helpful. Objective : Remove Background from video and save it
5
0
752
Sep ’24
Can an SDF be rendered using RealityKit?
I'm trying to ray-march an SDF inside a RealityKit surface shader. For the SDF primitive to correctly render with other primitives, the depth of the fragment needs to be set according to the ray-surface intersection point. Is there a way to do that within a RealityKit surface shader? It seems the only values I can set are within surface::surface_properties. If not, can an SDF still be rendered in RealityKit using ray-marching?
1
1
467
Sep ’24
How is CAShapeLayer implemented
Hello, I want to create a painting app for iOS and I saw many examples use a CAShapeLayer to draw a UIBezierPath. As I understand CoreAnimation uses the GPU so I was wondering how is this implemented on the GPU? Or in other words, how would you do it with Metal or OpenGL? I can only think of continuously updating a texture in response to the user's drawing but that would be a very resource intensive operation... Thanks
3
0
453
Sep ’24
CIFilter chain failing to render parts of output
I’ve built a iOS camera app that applies many CIFilters to an image captured by the camera. Some of my users have reported that on occasion the images have large parts that are blank, see below: Frustratingly, I can’t reproduce this myself! Does anyone know what could he causing it, is it a memory issue? I haven’t posted the code as there’s a lot to look over and I’m not sure it would help diagnose it. Thanks for any pointers.
1
0
400
Sep ’24
Memory Leak using simple app with visionOS
Hello. When displaying a simple app like this: struct ContentView: View { var body: some View { EmptyView() } } And run the Leaks app from the developer tools in Xcode, I see a memory leak which I don't see when running the same application on iOS. You can simply run the app and it will show a memory leak. And this is what I see in the Leaks application. Any ideas on what is going on? Thanks!
2
0
432
Sep ’24
RealityKit 3D texture
In my Metal-based app, I ray-march a 3D texture. I'd like to use RealityKit instead of my own code. I see there is a LowLevelTexture (beta) where I could specify a 3D texture. However on the Metal side, there doesn't seem to be any way to access a 3D texture (realitykit::texture::textures::custom returns a texture2d). Any work-arounds? Could I even do something icky like cast the texture2d to a texture3d in MSL? (is that even possible?) Could I encode the 3d texture into an argument buffer and get that in somehow?
1
0
440
Aug ’24
Disable Automatic Color Space conversion on Vision Pro Metal Shader
I am trying to convert a ThreeJS project to Metal for the Vision Pro. The issue is ThreeJS doesn't do any color space conversion (when I output a color in a fragment shader and then read it using the digital color meter in SRGB mode I get the same value I inputed in the fragment shader) This is not the case when using metal. When setting up my LayerRenderer I set the colorFormat to rgba16Unorm since it is the only non srgb color format supported on the vision pro apps. However switching between bgra8Unorm_srgb and rgba16Unorm seems to have no affect. when I set up the renderPassDescriptor I use the drawable colorTexture renderPassDescriptor.colorAttachments[0].texture = drawable.colorTextures[0] and when printing its pixel format it seems to be passed from the configuration. If there is anyway to disable this behavior or perform an inverse function of such that I get the original value out from the shader, that would be appreciated.
0
0
376
Aug ’24
To use ARSCNView to capture a 3D model of a scene and obtain the mesh information, how can I retrieve the texture information for the mesh?
arScnView = ARSCNView(frame: CGRect.zero, options: nil) arScnView.delegate = self arScnView.automaticallyUpdatesLighting = true arScnView.allowsCameraControl = true addSubview(arScnView) arSession = arScnView.session arSession.delegate = self config = ARWorldTrackingConfiguration() config.sceneReconstruction = .meshWithClassification config.environmentTexturing = .automatic func session(_ session: ARSession, didAdd anchors: [ARAnchor]) { anchors.forEach({ anchor in if let meshAnchor = anchor as? ARMeshAnchor { let node = meshAnchor.toSCNNode() self.arScnView.scene.rootNode.addChildNode(node) } if let environmentProbeAnchor = anchor as? AREnvironmentProbeAnchor { // Can I retrieve the texture map corresponding to ARMeshAnchor from Environment Probe Anchor? // Or how can I retrieve the texture map corresponding to ARMeshAnchor? } }) } How can I scan a 3D scene and save it as USDZ? I want to achieve the following scenario?
0
0
349
Sep ’24
Metal UIView to transform what's behind it
I'm trying to create a custom Metal-based visual effect as a UIView to be used inside an existing UIKit-based interface. (An example might be a view that applies a blur effect to what's behind it.) I need to capture the MTLTexture of what's behind the view so that I can feed it to MTLRenderCommandEncoder.setFragmentTexture(_:index:). Can someone show me how or point me to an example? Thanks!
2
0
414
Oct ’24
CIImageProcessorKernel using Metal Compute Pipeline error
Greetings! I have been battling with a bit of a tough issue. My use case is running a pixelwise regression model on a 2D array of images using CIImageProcessorKernel and a custom Metal Shader. It mostly works great, but the issue that arises is that if the regression calculation in Metal takes too long, an error occurs and the resulting output texture has strange artifacts, for example: The specific error is: Error excuting command buffer = Error Domain=MTLCommandBufferErrorDomain Code=1 "Internal Error (0000000e:Internal Error)" UserInfo={NSLocalizedDescription=Internal Error (0000000e:Internal Error), NSUnderlyingError=0x60000320ca20 {Error Domain=IOGPUCommandQueueErrorDomain Code=14 "(null)"}} (com.apple.CoreImage) There are multiple levels of concurrency: Swift Concurrency calling the Core Image code (which shouldn't have an impact) and of course the Metal command buffer. Is there anyway to ensure the compute command encoder can complete its work? Here is the full implementation of my CIImageProcessorKernel subclass: class ParametricKernel: CIImageProcessorKernel { static let device = MTLCreateSystemDefaultDevice()! override class var outputFormat: CIFormat { return .BGRA8 } override class func formatForInput(at input: Int32) -> CIFormat { return .BGRA8 } override class func process(with inputs: [CIImageProcessorInput]?, arguments: [String : Any]?, output: CIImageProcessorOutput) throws { guard let commandBuffer = output.metalCommandBuffer, let images = arguments?["images"] as? [CGImage], let mask = arguments?["mask"] as? CGImage, let fillTime = arguments?["fillTime"] as? CGFloat, let betaLimit = arguments?["betaLimit"] as? CGFloat, let alphaLimit = arguments?["alphaLimit"] as? CGFloat, let errorScaling = arguments?["errorScaling"] as? CGFloat, let timing = arguments?["timing"], let TTRThreshold = arguments?["ttrthreshold"] as? CGFloat, let input = inputs?.first, let sourceTexture = input.metalTexture, let destinationTexture = output.metalTexture else { return } guard let kernelFunction = device.makeDefaultLibrary()?.makeFunction(name: "parametric") else { return } guard let commandEncoder = commandBuffer.makeComputeCommandEncoder() else { return } let imagesTexture = Texture.textureFromImages(images) let pipelineState = try device.makeComputePipelineState(function: kernelFunction) commandEncoder.setComputePipelineState(pipelineState) commandEncoder.setTexture(imagesTexture, index: 0) let maskTexture = Texture.textureFromImages([mask]) commandEncoder.setTexture(maskTexture, index: 1) commandEncoder.setTexture(destinationTexture, index: 2) var errorScalingFloat = Float(errorScaling) let errorBuffer = device.makeBuffer(bytes: &errorScalingFloat, length: MemoryLayout<Float>.size, options: []) commandEncoder.setBuffer(errorBuffer, offset: 0, index: 1) // Other buffers omitted.... let threadsPerThreadgroup = MTLSizeMake(16, 16, 1) let width = Int(ceil(Float(sourceTexture.width) / Float(threadsPerThreadgroup.width))) let height = Int(ceil(Float(sourceTexture.height) / Float(threadsPerThreadgroup.height))) let threadGroupCount = MTLSizeMake(width, height, 1) commandEncoder.dispatchThreadgroups(threadGroupCount, threadsPerThreadgroup: threadsPerThreadgroup) commandEncoder.endEncoding() } }
3
0
554
Aug ’24
How many warps can be run in parallel on a single shader core?
The Metal feature set tables specifies that beginning with the Apple4 family, the "Maximum threads per threadgroup" is 1024. Given that a single threadgroup is guaranteed to be run on the same GPU shader core, it means that a shader core of any new Apple GPU must be capable of running at least 1024/32 = 32 warps in parallel. From the WWDC session "Scale compute workloads across Apple GPUs (6:17)": For relatively complex kernels, 1K to 2K concurrent threads per shader core is considered a very good occupancy. The cited sentence suggests that a single shader core is capable of running at least 2K (I assume this is meant to be 2048) threads in parallel, so 2048/32 = 64 warps running in parallel. However, I am curious what is the maximum theoretical amount of warps running in parallel on a single shader core (it sounds like it is more than 64). The WWDC session mentions 2K to be only "very good" occupancy. How many threads would be "the best possible" occupancy?
1
0
450
Aug ’24
In Metal compute kernels, when do thread variables get spilled into the device memory?
How many 32-bit variables can I use concurrently in a single thread of a Metal compute kernel without worrying about the variables getting spilled into the device memory? Alternatively: how many 32-bit registers does a single thread have available for itself? Let's say that each thread of my compute kernel needs to store and work with its own array of N float variables, where N can be 128, 256, 512 or more. To achieve maximum possible performance, I do not want to the local thread variables to get spilled into the slow device memory. I want all N variables to be stored "on-chip", in the thread memory space. To make my question more concrete, let's say there is an array thread float localArray[N]. Assuming an unrealistic hypothetical scenario where localArray is the only variable in the whole kernel, what is the maximum value of N for which no portion of localArray would get spilled into the device memory? I searched in the Metal feature set tables, but I could not find any details.
0
0
333
Aug ’24
Enabling EDR on Apple TV for custom HDR video playback with VideoToolbox
How is it possible to enable EDR on Apple TV without AVFoundation for custom HDR video playback? The use case is a custom video player for HDR playback via VideoToolbox and Metal, which seem to render colors correctly on iOS but not on tvOS. All related documentation and WWDC sessions describe APIs that are unavailable for tvOS: let metalLayer = CAMetalLayer() metalLayer.wantsExtendedDynamicRangeContent = true metalLayer.edrMetadata = CAEDRMetadata.hdr10(minLuminance: 0.0, maxLuminance: 1000, opticalOutputScale: 100) What's the alternative path for tvOS to have correct system tone mapping for a setup like: metalLayer.pixelFormat = .rgba16Float // (or .bgr10_xr) metalLayer.colorspace = CGColorSpace(name: CGColorSpace.itur_2100_PQ) Video format: HEVC, YUV 4:2:0 10bit, BT.2020 PQ. We do set the preferredDisplayCriteria on AVDisplayManager and thus video range matching is in place. WWDC Ref: https://developer.apple.com/videos/play/wwdc2022/110565?time=557
1
1
371
Aug ’24
Metal Performance Shader color issue with yCbCr buffer
I'm making an app that reads a ProRes file, processes each frame through metal to resize and scale it, then outputs a new ProRes file. In the future the app will support other codecs but for now just ProRes. I'm reading the ProRes 422 buffers in the kCVPixelFormatType_422YpCbCr16 pixel format. This is what's recommended by Apple in this video https://developer.apple.com/wwdc20/10090?time=599. When the MTLTexture is run through a metal performance shader, the colorspace seems to force RGB or is just not allowing yCbCr textures as the output is all green/purple. If you look at the render code, you will see there's a commented out block of code to just blit copy the outputTexture, if you perform the copy instead of the scaling through MPS, the output colorspace is fine. So it appears the issue is from Metal Performance Shaders. Side note - I noticed that when using this format, it brings in the YpCbCr texture as a single plane. I thought it's preferred to handle this as two separate planes? That said, if I have two separate planes, that makes my app more complicated as I would need to scale both planes or merge it to RGB. But I'm going for the most performance possible. A sample project can be found here: https://www.dropbox.com/scl/fo/jsfwh9euc2ns2o3bbmyhn/AIomDYRhxCPVaWw9XH-qaN0?rlkey=sp8g0sb86af1u44p3xy9qa3b9&dl=0 Inside the supporting files, there is a test movie. For ease, I would move this to somewhere easily accessible (i.e Desktop). Load and run the example project. Click 'Select Video' Select that video you placed on your desktop It will now output a new video next to the selected one, named "Output.mov" The new video should just be scaled at 50%, but the colorspace is all wrong. Below is a photo of before and after the metal performance shader.
3
0
459
Aug ’24
View Matrix in Metal "Game" Sample code
I tried to understand the view matrix. The part from original code as below: private func updateGameState() { /// Update any game state before rendering uniforms[0].projectionMatrix = projectionMatrix let rotationAxis = SIMD3<Float>(1, 1, 0) let modelMatrix = matrix4x4_rotation(radians: rotation, axis: rotationAxis) let viewMatrix = matrix4x4_translation(0.0, 0.0, -8.0) uniforms[0].modelViewMatrix = simd_mul(viewMatrix, modelMatrix) rotation += 0.01 } If the view matrix is initialed in x = -0.5, as:let viewMatrix = matrix4x4_translation(-0.5, 0.0, -8.0) The cube in the MetalView will move left. I think it should move to right hand side because View Matrix is camera position, am I wrong?
0
0
332
Aug ’24
crash log shows abort() called inside Metal driver code?
We have been having a mysterious crash in our media server app that I've never seen before. After fixing a number of other rare thread safety crashes relating to Metal buffers, this rare crash happens inside a Metal com.Metal.CompletionQueueDispatch? I have no clue what is happening here. It looks to me like Metal is specifically calling abort() for some reason. All of the other threads in the crash log appear to be in a normal state. Thread 70 Crashed:: updateAllMedia Dispatch queue: com.Metal.CompletionQueueDispatch 0 libsystem_kernel.dylib 0x1af572d38 __pthread_kill + 8 1 libsystem_pthread.dylib 0x1af5a7ee0 pthread_kill + 288 2 libsystem_c.dylib 0x1af4e2330 abort + 168 3 libc++abi.dylib 0x1af562b18 abort_message + 132 4 libc++abi.dylib 0x1af552a3c demangling_terminate_handler() + 312 5 libobjc.A.dylib 0x1af4481c8 _objc_terminate() + 160 6 libc++abi.dylib 0x1af561eb4 std::__terminate(void (*)()) + 20 7 libc++abi.dylib 0x1af561e50 std::terminate() + 64 8 libdispatch.dylib 0x1af3e4288 _dispatch_client_callout4 + 40 9 libdispatch.dylib 0x1af40053c _dispatch_mach_msg_invoke + 464 10 libdispatch.dylib 0x1af3eb784 _dispatch_lane_serial_drain + 376 11 libdispatch.dylib 0x1af40125c _dispatch_mach_invoke + 456 12 libdispatch.dylib 0x1af3eb784 _dispatch_lane_serial_drain + 376 13 libdispatch.dylib 0x1af3ec438 _dispatch_lane_invoke + 444 14 libdispatch.dylib 0x1af3eb784 _dispatch_lane_serial_drain + 376 15 libdispatch.dylib 0x1af3ec404 _dispatch_lane_invoke + 392 16 libdispatch.dylib 0x1af3f6c98 _dispatch_workloop_worker_thread + 648 17 libsystem_pthread.dylib 0x1af5a4360 _pthread_wqthread + 288 18 libsystem_pthread.dylib 0x1af5a3080 start_wqthread + 8 Note that the thread name "updateAllMedia" is a misnomer because this thread appears to be a general Metal dispatch queue. I wish there was a debugging option in Metal that called "setThreadName" to name its internal threads.
1
0
509
Aug ’24
Does iPhone X support quad_sum operation?
I use quad_sum to optimize the lighting grid and shadow filter performance. Based on Metal Feature Set Tables, Apple Family 4 should support quad group operations like quad_sum and quad_max. However, on the iPhone X and iPhone 8, during creating pipeline states, we have the following error output: Encountered unlowered function call to air.quad_sum.f32. It works perfectly for iPhone 11 and higher versions. Should I improve my feature-checking logic from Apple Family 4 to Apple Family 5, or do I have other options to fix this unexpected behavior?
1
0
461
Aug ’24