Hello,
I'm getting started with Xcode Cloud for my project, since I upgraded to the macOS Sequoia beta and Xcode 16 now refuses to archive builds for TestFlight.
Somewhere very late in the build process I get the following error:
realitytool requires Metal for this operation and it is not available in this build environment
The log says this happens at:
Compile Skybox urban.skybox
My project uses RealityKit. How can I fix this issue?
Thanks!
I am searching for a method to remove the background from a video. The video can come from a camera session's fileOutput URL or from the photo library.
I was able to get a live preview with the background removed using the depth data and some Metal code from the example Enhancing Live Video by Leveraging TrueDepth Camera Data. However, I couldn't figure out a way to save the result as a video so that I can upload it.
This method also uses over 150% CPU (as reported by Xcode), which seems like a lot; the device heats up quickly and drops frames when it gets hot.
I also found something similar on GitHub, a Core ML example by Dmitry Voitekh, which uses less than 40% CPU.
Any information regarding this will be helpful.
Objective: Remove the background from a video and save it
I'm trying to ray-march an SDF inside a RealityKit surface shader. For the SDF primitive to correctly render with other primitives, the depth of the fragment needs to be set according to the ray-surface intersection point. Is there a way to do that within a RealityKit surface shader? It seems the only values I can set are within surface::surface_properties.
If not, can an SDF still be rendered in RealityKit using ray-marching?
Hello,
I want to create a painting app for iOS and I saw many examples use a CAShapeLayer to draw a UIBezierPath.
As I understand CoreAnimation uses the GPU so I was wondering how is this implemented on the GPU? Or in other words, how would you do it with Metal or OpenGL?
I can only think of continuously updating a texture in response to the user's drawing but that would be a very resource intensive operation...
Thanks
Hello,
my project is simple: first I want to draw wireframe hexahedrons, tetrahedrons and octahedrons.
I drew a cube with Metal, but I couldn't find out how to do rotation, translation and scaling.
I have searched for help, but the examples I found are too complicated for me.
Kind regards,
VanceRegnet
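For the rotation, translation and scaling asked about here, a minimal sketch in plain Swift may help. Metal itself has no transform functions; the usual approach is to build a 4x4 matrix on the CPU and pass it to the vertex shader. The `Mat4` type and its helper names below are hypothetical, and the code avoids `simd` only to keep the arithmetic visible. A real project would more likely use `simd_float4x4`, which stores columns rather than the rows used here.

```swift
import Foundation

// Hypothetical minimal 4x4 matrix, stored row-major as m[row][col].
struct Mat4 {
    var m: [[Float]]

    static let identity = Mat4(m: [
        [1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 1],
    ])

    // Moves points by (x, y, z).
    static func translation(_ x: Float, _ y: Float, _ z: Float) -> Mat4 {
        var r = identity
        r.m[0][3] = x; r.m[1][3] = y; r.m[2][3] = z
        return r
    }

    // Scales points about the origin.
    static func scale(_ x: Float, _ y: Float, _ z: Float) -> Mat4 {
        var r = identity
        r.m[0][0] = x; r.m[1][1] = y; r.m[2][2] = z
        return r
    }

    // Rotates points counterclockwise about the Z axis.
    static func rotationZ(_ radians: Float) -> Mat4 {
        var r = identity
        let c = Float(cos(Double(radians)))
        let s = Float(sin(Double(radians)))
        r.m[0][0] = c; r.m[0][1] = -s
        r.m[1][0] = s; r.m[1][1] = c
        return r
    }

    // Matrix product: (a * b) applies b first, then a.
    static func * (a: Mat4, b: Mat4) -> Mat4 {
        var r = Mat4(m: Array(repeating: Array(repeating: Float(0), count: 4), count: 4))
        for i in 0..<4 {
            for j in 0..<4 {
                for k in 0..<4 {
                    r.m[i][j] += a.m[i][k] * b.m[k][j]
                }
            }
        }
        return r
    }

    // Transforms the point (x, y, z), treated as (x, y, z, 1).
    func apply(_ p: [Float]) -> [Float] {
        let v: [Float] = p + [1]
        var out: [Float] = [0, 0, 0]
        for i in 0..<3 {
            for k in 0..<4 {
                out[i] += m[i][k] * v[k]
            }
        }
        return out
    }
}

// Scale first, then rotate by 90 degrees, then translate.
let model = Mat4.translation(1, 0, 0) * Mat4.rotationZ(.pi / 2) * Mat4.scale(2, 2, 2)
let moved = model.apply([1, 0, 0])
```

Here the point (1, 0, 0) is scaled to (2, 0, 0), rotated onto the Y axis, and then shifted by one unit in X. The order of the factors matters: swapping them gives a different result.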
I’ve built an iOS camera app that applies many CIFilters to an image captured by the camera. Some of my users have reported that on occasion large parts of the images are blank, see below:
Frustratingly, I can’t reproduce this myself! Does anyone know what could be causing it? Is it a memory issue? I haven’t posted the code as there’s a lot to look over and I’m not sure it would help diagnose it.
Thanks for any pointers.
Hello.
When displaying a simple app like this:
struct ContentView: View {
    var body: some View {
        EmptyView()
    }
}
and running the Leaks app from the developer tools in Xcode, I see a memory leak which I don't see when running the same application on iOS.
You can simply run the app and it will show a memory leak. And this is what I see in the Leaks application.
Any ideas on what is going on?
Thanks!
In my Metal-based app, I ray-march a 3D texture. I'd like to use RealityKit instead of my own code. I see there is a LowLevelTexture (beta) where I could specify a 3D texture. However on the Metal side, there doesn't seem to be any way to access a 3D texture (realitykit::texture::textures::custom returns a texture2d).
Any work-arounds? Could I even do something icky like cast the texture2d to a texture3d in MSL? (is that even possible?) Could I encode the 3d texture into an argument buffer and get that in somehow?
I am trying to convert a ThreeJS project to Metal for the Vision Pro. The issue is that ThreeJS doesn't do any color space conversion: when I output a color in a fragment shader and then read it using the Digital Color Meter in sRGB mode, I get the same value I put into the fragment shader. This is not the case when using Metal. When setting up my LayerRenderer I set the colorFormat to rgba16Unorm, since it is the only non-sRGB color format supported in Vision Pro apps. However, switching between bgra8Unorm_srgb and rgba16Unorm seems to have no effect.
When I set up the renderPassDescriptor I use the drawable's colorTexture:
renderPassDescriptor.colorAttachments[0].texture = drawable.colorTextures[0]
and when I print its pixel format, it appears to be the one passed in from the configuration.
If there is any way to disable this behavior, or to apply an inverse function so that I get the original value out of the shader, that would be appreciated.
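If the values are being sRGB-encoded somewhere on the way to the display, one possible workaround is to apply the inverse transfer function in the fragment shader, so that the later encoding step restores the original value. The sketch below writes out the standard sRGB transfer functions in Swift for clarity. It is only an assumption that this is the conversion being applied on Vision Pro, and the function names are illustrative; the same arithmetic could be ported to MSL.

```swift
import Foundation

// Standard sRGB decoding: encoded value in [0, 1] -> linear light.
func srgbToLinear(_ c: Double) -> Double {
    return c <= 0.04045 ? c / 12.92 : pow((c + 0.055) / 1.055, 2.4)
}

// Standard sRGB encoding: linear light in [0, 1] -> encoded value.
func linearToSRGB(_ c: Double) -> Double {
    return c <= 0.0031308 ? c * 12.92 : 1.055 * pow(c, 1.0 / 2.4) - 0.055
}

// Pre-applying the decode means a later encode step returns the original.
let intended = 0.5
let shaderOutput = srgbToLinear(intended)   // what the shader would write
let displayed = linearToSRGB(shaderOutput)  // what an sRGB encode would produce
```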
arScnView = ARSCNView(frame: CGRect.zero, options: nil)
arScnView.delegate = self
arScnView.automaticallyUpdatesLighting = true
arScnView.allowsCameraControl = true
addSubview(arScnView)
arSession = arScnView.session
arSession.delegate = self
config = ARWorldTrackingConfiguration()
config.sceneReconstruction = .meshWithClassification
config.environmentTexturing = .automatic
func session(_ session: ARSession, didAdd anchors: [ARAnchor])
{
    anchors.forEach({ anchor in
        if let meshAnchor = anchor as? ARMeshAnchor {
            let node = meshAnchor.toSCNNode()
            self.arScnView.scene.rootNode.addChildNode(node)
        }
        if let environmentProbeAnchor = anchor as? AREnvironmentProbeAnchor {
            // Can I retrieve the texture map corresponding to ARMeshAnchor from Environment Probe Anchor?
            // Or how can I retrieve the texture map corresponding to ARMeshAnchor?
        }
    })
}
How can I scan a 3D scene and save it as USDZ?
I want to achieve the following scenario.
I'm trying to create a custom Metal-based visual effect as a UIView to be used inside an existing UIKit-based interface. (An example might be a view that applies a blur effect to what's behind it.) I need to capture the MTLTexture of what's behind the view so that I can feed it to MTLRenderCommandEncoder.setFragmentTexture(_:index:). Can someone show me how or point me to an example? Thanks!
Greetings! I have been battling with a bit of a tough issue. My use case is running a pixelwise regression model on a 2D array of images using CIImageProcessorKernel and a custom Metal Shader.
It mostly works great, but the issue that arises is that if the regression calculation in Metal takes too long, an error occurs and the resulting output texture has strange artifacts, for example:
The specific error is:
Error excuting command buffer = Error Domain=MTLCommandBufferErrorDomain Code=1 "Internal Error (0000000e:Internal Error)" UserInfo={NSLocalizedDescription=Internal Error (0000000e:Internal Error), NSUnderlyingError=0x60000320ca20 {Error Domain=IOGPUCommandQueueErrorDomain Code=14 "(null)"}} (com.apple.CoreImage)
There are multiple levels of concurrency: Swift Concurrency calling the Core Image code (which shouldn't have an impact) and of course the Metal command buffer.
Is there any way to ensure the compute command encoder can complete its work?
Here is the full implementation of my CIImageProcessorKernel subclass:
class ParametricKernel: CIImageProcessorKernel {
    static let device = MTLCreateSystemDefaultDevice()!
    override class var outputFormat: CIFormat {
        return .BGRA8
    }
    override class func formatForInput(at input: Int32) -> CIFormat {
        return .BGRA8
    }
    override class func process(with inputs: [CIImageProcessorInput]?, arguments: [String : Any]?, output: CIImageProcessorOutput) throws {
        guard
            let commandBuffer = output.metalCommandBuffer,
            let images = arguments?["images"] as? [CGImage],
            let mask = arguments?["mask"] as? CGImage,
            let fillTime = arguments?["fillTime"] as? CGFloat,
            let betaLimit = arguments?["betaLimit"] as? CGFloat,
            let alphaLimit = arguments?["alphaLimit"] as? CGFloat,
            let errorScaling = arguments?["errorScaling"] as? CGFloat,
            let timing = arguments?["timing"],
            let TTRThreshold = arguments?["ttrthreshold"] as? CGFloat,
            let input = inputs?.first,
            let sourceTexture = input.metalTexture,
            let destinationTexture = output.metalTexture
        else {
            return
        }
        guard let kernelFunction = device.makeDefaultLibrary()?.makeFunction(name: "parametric") else {
            return
        }
        guard let commandEncoder = commandBuffer.makeComputeCommandEncoder() else {
            return
        }
        let imagesTexture = Texture.textureFromImages(images)
        let pipelineState = try device.makeComputePipelineState(function: kernelFunction)
        commandEncoder.setComputePipelineState(pipelineState)
        commandEncoder.setTexture(imagesTexture, index: 0)
        let maskTexture = Texture.textureFromImages([mask])
        commandEncoder.setTexture(maskTexture, index: 1)
        commandEncoder.setTexture(destinationTexture, index: 2)
        var errorScalingFloat = Float(errorScaling)
        let errorBuffer = device.makeBuffer(bytes: &errorScalingFloat, length: MemoryLayout<Float>.size, options: [])
        commandEncoder.setBuffer(errorBuffer, offset: 0, index: 1)
        // Other buffers omitted....
        let threadsPerThreadgroup = MTLSizeMake(16, 16, 1)
        let width = Int(ceil(Float(sourceTexture.width) / Float(threadsPerThreadgroup.width)))
        let height = Int(ceil(Float(sourceTexture.height) / Float(threadsPerThreadgroup.height)))
        let threadGroupCount = MTLSizeMake(width, height, 1)
        commandEncoder.dispatchThreadgroups(threadGroupCount, threadsPerThreadgroup: threadsPerThreadgroup)
        commandEncoder.endEncoding()
    }
}
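As an aside on the dispatch sizing at the end of `process`, the threadgroup count can be computed with integer ceiling division, which avoids the round trip through Float. This is only a sketch of the same arithmetic; the helper name is hypothetical, and it does not address the command buffer error itself.

```swift
// Hypothetical helper: number of threadgroups needed to cover `size`
// elements with `groupSize` threads each (integer ceiling division).
func threadgroupCount(covering size: Int, groupSize: Int) -> Int {
    precondition(size >= 0 && groupSize > 0)
    return (size + groupSize - 1) / groupSize
}

// Example: a 1920x1080 texture with 16x16 threadgroups.
let groupsWide = threadgroupCount(covering: 1920, groupSize: 16)  // 120
let groupsHigh = threadgroupCount(covering: 1080, groupSize: 16)  // 68
```

Because the grid is rounded up, some threads in the last row or column of threadgroups can fall outside the texture, so the kernel needs to check its position against the texture size before writing.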
The Metal feature set tables specify that, beginning with the Apple4 family, the "Maximum threads per threadgroup" is 1024. Given that a single threadgroup is guaranteed to run on the same GPU shader core, this means that a shader core of any new Apple GPU must be capable of running at least 1024/32 = 32 warps in parallel.
From the WWDC session "Scale compute workloads across Apple GPUs (6:17)":
For relatively complex kernels, 1K to 2K concurrent threads per shader core is considered a very good occupancy.
The cited sentence suggests that a single shader core is capable of running at least 2K (I assume this is meant to be 2048) threads in parallel, so 2048/32 = 64 warps running in parallel.
However, I am curious what the maximum theoretical number of warps running in parallel on a single shader core is (it sounds like it is more than 64). The WWDC session describes 2K as only a "very good" occupancy. How many threads would be "the best possible" occupancy?
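The arithmetic behind these figures, written out as a small sketch. The width of 32 threads per warp (SIMD-group) is the value this post assumes, and the constant names are illustrative.

```swift
// Assumed SIMD-group ("warp") width, as used in the post.
let simdWidth = 32

// "Maximum threads per threadgroup" from the feature set tables (Apple4 and later).
let maxThreadsPerThreadgroup = 1024

// "2K" concurrent threads per shader core, read as 2048.
let goodOccupancyThreads = 2048

let simdGroupsPerThreadgroup = maxThreadsPerThreadgroup / simdWidth  // 32
let simdGroupsAtGoodOccupancy = goodOccupancyThreads / simdWidth     // 64
```

At run time, the actual values for a given kernel can be read from the `threadExecutionWidth` and `maxTotalThreadsPerThreadgroup` properties of `MTLComputePipelineState` rather than assumed.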
Our app encountered the following error:
Execution of the command buffer was aborted due to an error during execution. Ignored (for causing prior/excessive GPU errors) (00000004:kIOGPUCommandBufferCallbackErrorSubmissionsIgnored)
How many 32-bit variables can I use concurrently in a single thread of a Metal compute kernel without worrying about the variables getting spilled into the device memory? Alternatively: how many 32-bit registers does a single thread have available for itself?
Let's say that each thread of my compute kernel needs to store and work with its own array of N float variables, where N can be 128, 256, 512 or more. To achieve the maximum possible performance, I do not want the local thread variables to get spilled into the slow device memory. I want all N variables to be stored "on-chip", in the thread memory space.
To make my question more concrete, let's say there is an array thread float localArray[N]. Assuming an unrealistic hypothetical scenario where localArray is the only variable in the whole kernel, what is the maximum value of N for which no portion of localArray would get spilled into the device memory?
I searched in the Metal feature set tables, but I could not find any details.
How is it possible to enable EDR on Apple TV without AVFoundation for custom HDR video playback? The use case is a custom video player for HDR playback via VideoToolbox and Metal, which seem to render colors correctly on iOS but not on tvOS.
All related documentation and WWDC sessions describe APIs that are unavailable for tvOS:
let metalLayer = CAMetalLayer()
metalLayer.wantsExtendedDynamicRangeContent = true
metalLayer.edrMetadata = CAEDRMetadata.hdr10(minLuminance: 0.0, maxLuminance: 1000, opticalOutputScale: 100)
What's the alternative path for tvOS to have correct system tone mapping for a setup like:
metalLayer.pixelFormat = .rgba16Float // (or .bgr10_xr)
metalLayer.colorspace = CGColorSpace(name: CGColorSpace.itur_2100_PQ)
Video format: HEVC, YUV 4:2:0 10bit, BT.2020 PQ.
We do set the preferredDisplayCriteria on AVDisplayManager and thus video range matching is in place.
WWDC Ref: https://developer.apple.com/videos/play/wwdc2022/110565?time=557
I'm making an app that reads a ProRes file, processes each frame through Metal to resize and scale it, then outputs a new ProRes file. In the future the app will support other codecs, but for now just ProRes. I'm reading the ProRes 422 buffers in the kCVPixelFormatType_422YpCbCr16 pixel format. This is what's recommended by Apple in this video: https://developer.apple.com/wwdc20/10090?time=599.
When the MTLTexture is run through a Metal Performance Shader, the color space seems to be forced to RGB, or YCbCr textures are simply not supported, because the output is all green/purple. In the render code there is a commented-out block that just blit-copies the outputTexture; if you perform the copy instead of scaling through MPS, the output color space is fine. So the issue appears to come from Metal Performance Shaders.
Side note: I noticed that when using this format, the YpCbCr texture is brought in as a single plane. I thought it was preferred to handle this as two separate planes? That said, having two separate planes would make my app more complicated, as I would need to scale both planes or merge them into RGB. But I'm going for the best possible performance.
A sample project can be found here: https://www.dropbox.com/scl/fo/jsfwh9euc2ns2o3bbmyhn/AIomDYRhxCPVaWw9XH-qaN0?rlkey=sp8g0sb86af1u44p3xy9qa3b9&dl=0
Inside the supporting files, there is a test movie. For ease, I would move this to somewhere easily accessible (e.g. the Desktop).
Load and run the example project.
Click 'Select Video'
Select that video you placed on your desktop
It will now output a new video next to the selected one, named "Output.mov"
The new video should just be scaled at 50%, but the colorspace is all wrong.
Below is a photo of before and after the metal performance shader.
I am trying to understand the view matrix.
The relevant part of the original code is below:
private func updateGameState() {
    /// Update any game state before rendering
    uniforms[0].projectionMatrix = projectionMatrix
    let rotationAxis = SIMD3<Float>(1, 1, 0)
    let modelMatrix = matrix4x4_rotation(radians: rotation, axis: rotationAxis)
    let viewMatrix = matrix4x4_translation(0.0, 0.0, -8.0)
    uniforms[0].modelViewMatrix = simd_mul(viewMatrix, modelMatrix)
    rotation += 0.01
}
If the view matrix is initialized with x = -0.5, as in:
let viewMatrix = matrix4x4_translation(-0.5, 0.0, -8.0)
the cube in the MetalView moves left.
I think it should move to the right, because the view matrix is the camera position. Am I wrong?
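One way to see why the cube moves left: the view matrix is usually not the camera's position but the inverse of the camera's transform, so it is applied to the scene rather than to the camera. The sketch below illustrates this with pure translations only; the `Vec3` type and helper names are hypothetical.

```swift
// Minimal 3-component vector for illustration.
struct Vec3: Equatable {
    var x: Float
    var y: Float
    var z: Float
}

// Applying a translation to a point adds the offset.
func translate(_ p: Vec3, by t: Vec3) -> Vec3 {
    return Vec3(x: p.x + t.x, y: p.y + t.y, z: p.z + t.z)
}

// The view transform is the inverse of the camera transform;
// for a pure translation the inverse is the negated offset.
func viewTranslation(forCameraAt camera: Vec3) -> Vec3 {
    return Vec3(x: -camera.x, y: -camera.y, z: -camera.z)
}

let cubeInWorld = Vec3(x: 0, y: 0, z: 0)

// A view translation of (-0.5, 0, -8) shifts the cube to x = -0.5
// in view space, so it appears left of center.
let cubeInView = translate(cubeInWorld, by: Vec3(x: -0.5, y: 0, z: -8))

// The same view translation results from a camera located at (+0.5, 0, +8).
let view = viewTranslation(forCameraAt: Vec3(x: 0.5, y: 0, z: 8))
```

In other words, a view translation of -0.5 in x corresponds to a camera that moved 0.5 to the right, which makes the cube appear further left.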
We have been having a mysterious crash in our media server app that I've never seen before. After fixing a number of other rare thread safety crashes relating to Metal buffers, we now see this rare crash, which happens inside Metal's com.Metal.CompletionQueueDispatch queue.
I have no clue what is happening here. It looks to me like Metal is specifically calling abort() for some reason.
All of the other threads in the crash log appear to be in a normal state.
Thread 70 Crashed:: updateAllMedia Dispatch queue: com.Metal.CompletionQueueDispatch
0 libsystem_kernel.dylib 0x1af572d38 __pthread_kill + 8
1 libsystem_pthread.dylib 0x1af5a7ee0 pthread_kill + 288
2 libsystem_c.dylib 0x1af4e2330 abort + 168
3 libc++abi.dylib 0x1af562b18 abort_message + 132
4 libc++abi.dylib 0x1af552a3c demangling_terminate_handler() + 312
5 libobjc.A.dylib 0x1af4481c8 _objc_terminate() + 160
6 libc++abi.dylib 0x1af561eb4 std::__terminate(void (*)()) + 20
7 libc++abi.dylib 0x1af561e50 std::terminate() + 64
8 libdispatch.dylib 0x1af3e4288 _dispatch_client_callout4 + 40
9 libdispatch.dylib 0x1af40053c _dispatch_mach_msg_invoke + 464
10 libdispatch.dylib 0x1af3eb784 _dispatch_lane_serial_drain + 376
11 libdispatch.dylib 0x1af40125c _dispatch_mach_invoke + 456
12 libdispatch.dylib 0x1af3eb784 _dispatch_lane_serial_drain + 376
13 libdispatch.dylib 0x1af3ec438 _dispatch_lane_invoke + 444
14 libdispatch.dylib 0x1af3eb784 _dispatch_lane_serial_drain + 376
15 libdispatch.dylib 0x1af3ec404 _dispatch_lane_invoke + 392
16 libdispatch.dylib 0x1af3f6c98 _dispatch_workloop_worker_thread + 648
17 libsystem_pthread.dylib 0x1af5a4360 _pthread_wqthread + 288
18 libsystem_pthread.dylib 0x1af5a3080 start_wqthread + 8
Note that the thread name "updateAllMedia" is a misnomer because this thread appears to be a general Metal dispatch queue. I wish there was a debugging option in Metal that called "setThreadName" to name its internal threads.
I use quad_sum to optimize the lighting grid and shadow filter performance.
Based on the Metal Feature Set Tables, Apple Family 4 should support quad group operations like quad_sum and quad_max. However, on the iPhone X and iPhone 8, we get the following error output while creating pipeline states: Encountered unlowered function call to air.quad_sum.f32.
It works perfectly for iPhone 11 and higher versions. Should I improve my feature-checking logic from Apple Family 4 to Apple Family 5, or do I have other options to fix this unexpected behavior?