CIKernel call - performance hit and memory leak

I've been struggling with a bug with a Metal CIKernel call.

The kernel looks like this:

extern "C" float4 FeedbackIn(coreimage::sampler src, coreimage::sampler gradient, coreimage::sampler feedback, float time, float threshold, float optionValue, coreimage::destination dest )

{
   
  float4 input = src.sample(src.coord());
   
  return input; 

}


The call from Swift looks like this:

let result = ciKernel!.apply(extent: self.inputImage!.extent,
                  roiCallback: { [self]
         (index, rect) -> CGRect in
           let roiRect = rect.insetBy(dx: -1 * range, dy: -1 * range )
           return roiRect
         },
          arguments: [self.inputImage!, gradient!, feedback!, time, threshold, scaledOption])
        //       CIImage,      CIImage,  CIImage,  Float, Float,   Float

When I remove 'feedback' from the kernel and the corresponding argument from the list in the Swift code, there is no issue. With the code as shown, the export gradually slows down and memory use keeps growing when observed in Instruments, even though the Leaks instrument does not report any leaks.

I am running Xcode 13.3, macOS 12.1.1 on an iMac (Retina 5K, 27-inch, Late 2015).

Any insight on this is greatly appreciated! I don't know if passing the third CIImage is allowed or this is a bug.

Thanks to all in advance!

Answered by Boneoh in 711586022

I found a workaround. The CIImage that was queued for feedback had several Kernel filters applied to it. This seemed to be the cause of the memory issue. My workaround is to convert the CIImage to NSImage and then back to CIImage when adding it to the feedback queue. The CIImage from the NSImage has eliminated all of the intermediate operations that were in the original CIImage.

This is not very elegant, but it works. The virtual memory size is rock solid.
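
A minimal sketch of the flattening idea described above, with one substitution: instead of round-tripping through NSImage, it renders the image to a CGImage with a CIContext and wraps the result in a new CIImage. This is not necessarily how the workaround was implemented, and the helper name `flattened(_:using:)` is hypothetical. The point is only that the returned image is backed by pixels rather than by the chain of kernels that produced it.

```swift
import CoreImage

/// Hypothetical helper (not from the original post): renders `image` once and
/// returns a CIImage backed by the resulting bitmap, so it no longer refers to
/// the filter chain that produced it. Assumes `image` has a finite extent.
func flattened(_ image: CIImage, using context: CIContext) -> CIImage? {
    let extent = image.extent
    guard !extent.isInfinite,
          let cgImage = context.createCGImage(image, from: extent) else {
        return nil
    }
    // CIImage(cgImage:) places the bitmap at origin (0, 0), so translate it
    // back to the origin of the original extent.
    let shift = CGAffineTransform(translationX: extent.origin.x, y: extent.origin.y)
    return CIImage(cgImage: cgImage).transformed(by: shift)
}
```

The helper would be called once per frame, just before the image is added to the feedback queue, ideally with a single long-lived CIContext, since creating a context per frame is expensive.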

A little more detail about this. I am testing using a five-minute video 1920x1080.

This filter receives a CIImage that has been queued from a previous frame. So the feedback image is different for each frame. I'm guessing that somehow a reference to the feedback image is being retained but I don't know how I can prove it. The memory usage is steadily increasing over time. The memory use gradually increases to more than 6 GB.

I've saved the .trace file, but unfortunately the forum won't allow me to post a link to DropBox.

I've also tried changing the call in the Swift to use weak vars and pass them. It did not help.

         // test
         weak var feedback2 = feedback
         weak var gradient2 = gradient
         weak var image2 = self.inputImage

         // performance issue arises when passing in the 'feedback' image!
         let result = ciKernel!.apply(extent: self.inputImage!.extent,
                  roiCallback: { [self]
         (index, rect) -> CGRect in
           let roiRect = rect.insetBy(dx: -1 * range, dy: -1 * range )
           return roiRect
         },
          arguments: [ image2!, gradient2!, feedback2!, time, threshold, scaledOption])
        //       CIImage,      CIImage,  CIImage,  Float, Float,   Float
        // FeedbackIn(coreimage::sampler src, coreimage::sampler gradient, coreimage::sampler feedback, float time, float threshold, float optionValue, coreimage::destination dest )

Formatting in comments seems limited, and I don't want to post my comments as an answer.

In addition to the memory issue, I've found a performance issue.

In my original post, I passed three CIImage instances to the CIKernel. This caused a memory issue.

I removed one of the CIImage instances for testing, and the memory issue went away.

Next I found that the export processing speed continued to degrade. Exporting the first minute measured about 20 frames per second. Exporting the first two minutes degraded to about 9 frames per second. I stopped the third attempt after around five minutes.

I did another comparison. Passing the input source image and a gradient image for a two-minute test ran at about 28 frames per second. But when I pass the input image and a feedback image, it degraded to about 9 frames per second for a two-minute export.

The difference is that the gradient CIImage is static throughout the processing. The feedback CIImage is a queued image delayed from ten frames previously. That's the only clue I have for the performance. The performance degrades quickly over time. It starts out relatively fast and quickly degrades.
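
To make the "delayed from ten frames previously" setup concrete, here is a rough sketch of a fixed-delay queue. The type name `DelayQueue` and its API are hypothetical and are not taken from the original project. If the elements were CIImage values, each queued entry would also carry the recipe of filters that produced it, which may be relevant to the slowdown described here.

```swift
/// Hypothetical fixed-delay queue: `push` stores the newest element and returns
/// the one pushed `delay` calls earlier, or nil while the queue is still filling.
struct DelayQueue<Element> {
    private var storage: [Element] = []
    let delay: Int

    init(delay: Int) {
        precondition(delay > 0, "delay must be positive")
        self.delay = delay
    }

    mutating func push(_ element: Element) -> Element? {
        storage.append(element)
        // Once more than `delay` elements are held, release the oldest one.
        return storage.count > delay ? storage.removeFirst() : nil
    }
}

// Usage with a delay of ten frames, as described in the post:
var frames = DelayQueue<Int>(delay: 10)
for frameIndex in 0..<12 {
    if let delayed = frames.push(frameIndex) {
        print("frame \(frameIndex) receives feedback from frame \(delayed)")
    }
}
```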

I'm open to more debugging but am at a loss of how to dig any deeper on this.

You are capturing self in the roiCallback for no apparent reason. Can you please try to remove the [self] and check again?
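
One way to follow that suggestion, sketched under the assumption that `range` is a CGFloat property of the filter: copy the value into a closure that does not reference `self` at all. The function name `makeROICallback(range:)` is hypothetical. This only shows how to avoid the capture; it is not a claim that the capture causes the leak.

```swift
import CoreImage

/// Hypothetical helper: builds an ROI callback that captures only the `range`
/// value, so the closure holds no reference to the filter object.
func makeROICallback(range: CGFloat) -> CIKernelROICallback {
    return { _, rect in
        // A negative inset grows the rect by `range` on every side.
        rect.insetBy(dx: -range, dy: -range)
    }
}
```

The result can be passed directly as the `roiCallback:` argument of `apply(extent:roiCallback:arguments:)`.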

@FrankSchlegel - I tried removing the self reference, it did not help. I saw your post 704665 https://developer.apple.com/forums/thread/704665 and it seems like it might be related. Next I am going to try returning null to see what happens. Thanks a bunch!

I changed the kernel to simply return the input pixel. I then tried passing nil from the program when invoking the kernel. No memory issue. When I changed the program to pass the feedback image, the virtual memory in Activity Monitor rapidly rises and the process grinds to a halt. Not sure what to look at next.

I also tried using CIContext.clearCaches() but it did not help.

Any suggestions are appreciated! Thanks to all for reading.

I also tried just passing the two images to Apple's built-in CISourceOverCompositing filter, and the same memory problem occurred. If I simply comment out the call to the built-in composite filter, all is well. This makes me think something is broken in how Core Image processes these feedback images. I'm about to give up on this....

If I simply 'return foreground!' or 'return background!' there is no memory issue. If I invoke the filter in the code below, the memory use rapidly increases and performance degrades. If left going long enough, it will use all virtual memory and crash.   

private static func CISourceOverCompositing() -> CIImage?   {

    return background!

    let result = CIFilter(
        name: "CISourceOverCompositing",
        parameters: [
            "inputImage": foreground!,
            "inputBackgroundImage": background!
        ])?
        .outputImage!.cropped(to: extent!)

    return result!
}
