Why isn't my face detection code using CIDetector working properly?

I'm trying to detect faces in my iOS camera app, but detection misbehaves in several ways, while the built-in Camera.app handles the same scenes correctly. Specifically:
  • The first face isn't detected in my app, only in Camera.app.
  • For the third face — the East Asian woman — Camera.app correctly draws a rectangle around her face, while my app draws a rectangle that extends far below her face.
  • Obama's face isn't detected in my app, only in Camera.app.
  • When the camera zooms out from Putin's face, my app draws a rectangle over the right half of his face, cutting it in half, while Camera.app draws the rectangle correctly around his whole face.


Why is this happening?


My code is as follows. Do you see anything wrong? (verify() and verifyContains() below are my own assertion helpers; they check that the given condition holds.)


First, I create a video output as follows:


// Ask for BGRA frames and deliver them to the detector off the main queue.
let videoOutput = AVCaptureVideoDataOutput()
videoOutput.videoSettings =
    [kCVPixelBufferPixelFormatTypeKey as AnyHashable:
     Int(kCMPixelFormat_32BGRA)]
session.addOutput(videoOutput)
videoOutput.setSampleBufferDelegate(faceDetector, queue: faceDetectionQueue)
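

For completeness, the capture session is set up in the standard way — roughly this, simplified, with error handling elided:


let session = AVCaptureSession()
session.sessionPreset = AVCaptureSessionPresetPhoto
let camera = AVCaptureDevice.defaultDevice(withMediaType: AVMediaTypeVideo)!
let input = try! AVCaptureDeviceInput(device: camera)
session.addInput(input)
// Frames are delivered to the delegate on this queue.
let faceDetectionQueue = DispatchQueue(label: "face-detection")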


This is the delegate:


class FaceDetector: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
  // Called on faceDetectionQueue for every frame the camera delivers.
  func captureOutput(_ captureOutput: AVCaptureOutput!,
                     didOutputSampleBuffer sampleBuffer: CMSampleBuffer!,
                     from connection: AVCaptureConnection!) {
    let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)!
    let features = FaceDetector.ciDetector.features(
        in: CIImage(cvPixelBuffer: imageBuffer))


    let faces = features.map { $0.bounds }
    let imageSize = CVImageBufferGetDisplaySize(imageBuffer)


    // Convert each face rect from Core Image coordinates (pixels, origin at
    // the bottom-left) to normalized UIKit coordinates (top-left origin,
    // within the unit square).
    let faceBounds = faces.map { (face: CGRect) -> CGRect in
        var ciBounds = face


        // Scale into the unit square and flip the y-axis. (CGRect.applying()
        // standardizes its result, so the rect keeps a positive height.)
        ciBounds = ciBounds.applying(
            CGAffineTransform(scaleX: 1/imageSize.width, y: -1/imageSize.height))
        CGRect(x: 0, y: 0, width: 1, height: -1).verifyContains(ciBounds)


        // Shift the flipped rect back up into the unit square.
        let bounds = ciBounds.applying(CGAffineTransform(translationX: 0, y: 1.0))
        CGRect(x: 0, y: 0, width: 1, height: 1).verifyContains(bounds)
        return bounds
    }
    DispatchQueue.main.sync {
      facesUpdated(faceBounds, imageSize)
    }
  }

  private static let ciDetector = CIDetector(ofType: CIDetectorTypeFace,
      context: nil,
      options: [CIDetectorAccuracy: CIDetectorAccuracyHigh])!
}
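

To spell out the coordinate math the closure above is meant to perform, here it is by hand on a made-up face rect (the numbers are illustrative, not from a real run):


// A hypothetical 200 × 200 face at (400, 300) in a 1440 × 1080 buffer,
// in Core Image coordinates (origin at the bottom-left):
let ci = CGRect(x: 400, y: 300, width: 200, height: 200)

// Normalize and flip the y-axis; CGRect.applying() standardizes the
// result, so the height comes out positive:
let flipped = ci.applying(CGAffineTransform(scaleX: 1/1440, y: -1/1080))
// flipped ≈ (0.278, -0.463, 0.139, 0.185)

// Shift up into the unit square, giving a top-left-origin rect:
let normalized = flipped.applying(CGAffineTransform(translationX: 0, y: 1))
// normalized ≈ (0.278, 0.537, 0.139, 0.185)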


The facesUpdated() callback is as follows:


class PreviewView: UIView {
  private var faceRects = [UIView]()


  // Creates an empty bordered view that marks one detected face.
  private func makeFaceRect() -> UIView {
    let r = UIView()
    r.layer.borderWidth = FocusRect.borderWidth
    r.layer.borderColor = FocusRect.color.cgColor
    faceRects.append(r)
    addSubview(r)
    return r
  }


  private func removeAllFaceRects() {
    for faceRect in faceRects {
      verify(faceRect.superview == self)
      faceRect.removeFromSuperview()
    }
    faceRects.removeAll()
  }


  private func facesUpdated(_ faces: [CGRect], _ imageSize: CGSize) {
    removeAllFaceRects()


    // Scale the normalized face bounds up to this view's coordinate space.
    let faceFrames = faces.map { (original: CGRect) -> CGRect in
        let face = original.applying(CGAffineTransform(scaleX: bounds.width, y: bounds.height))
        verify(self.bounds.contains(face))
        return face
    }


    for faceFrame in faceFrames {
      let faceRect = makeFaceRect()
      faceRect.frame = faceFrame
    }
  }
}
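

One thing not shown above is the glue between the two classes: FaceDetector exposes a facesUpdated callback (the one invoked on the main queue in captureOutput), and the view controller points it at the preview view. Roughly this, simplified — facesUpdated(_:_:) on PreviewView would have to be made non-private for this exact sketch to compile:


// In FaceDetector:
var facesUpdated: ([CGRect], CGSize) -> Void = { _, _ in }

// At setup time, in the view controller:
faceDetector.facesUpdated = { [weak previewView] faces, imageSize in
    previewView?.facesUpdated(faces, imageSize)
}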


I also tried the following, but they didn't help:


  • Setting the AVCaptureVideoDataOutput's videoSettings to nil.
  • Explicitly setting the CIDetector's orientation to portrait (see the snippet after this list). The phone is in portrait for this test, so it shouldn't matter.
  • Setting and removing CIDetectorTracking: true
  • Setting and removing CIDetectorAccuracy: CIDetectorAccuracyHigh
  • Trying to track only one face, by looking only at the first feature detected.
  • Replacing CVImageBufferGetDisplaySize() with CVImageBufferGetEncodedSize() — they're the same anyway, at 1440 x 1080.
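

For reference, here's what I mean by explicitly setting the detector's orientation to portrait — passing the EXIF orientation value for the back camera held in portrait (6, if I have the value right) when asking for features:


let options = [CIDetectorImageOrientation: 6]
let features = FaceDetector.ciDetector.features(
    in: CIImage(cvPixelBuffer: imageBuffer), options: options)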