Properly projecting points with different orientations and camera positions?

Summary:
I am using the Vision framework, in conjunction with AVFoundation, to detect the facial landmarks of each face in the camera feed (by way of a VNDetectFaceLandmarksRequest). From there, I take the resulting observations and unproject each point into a SceneKit view (SCNView), then use those points as the vertices of a custom geometry, textured with a material, drawn over each detected face.

Effectively, I am working to recreate how an ARFaceTrackingConfiguration functions. In general, this works as expected, but only when my device is using the front camera in landscape right orientation. When I rotate my device, or switch to the rear camera, the unprojected points no longer align with the detected face the way they do with the front camera in landscape right.

Problem:
When testing this code, the mesh appears correctly (that is, it appears affixed to a user's face), but again, only when using the front camera in landscape right. While the code runs as expected in all orientations (that is, it generates a face mesh for each detected face), the mesh is wildly misaligned in every other case.

My belief is that this issue stems from how I convert the face's bounding box (using VNImageRectForNormalizedRect, which I calculate with the width/height of my SCNView rather than my pixel buffer, which is typically much larger), though every modification I have tried produces the same misalignment.
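
To illustrate the kind of modification I mean (a rough sketch, not my exact code), one variant flips the converted rect vertically, on the assumption that Vision's lower-left-origin normalized coordinates need converting to UIKit's upper-left origin:

Code Block
// Rough sketch of one such modification: convert Vision's lower-left-origin
// normalized rect into the SCNView's upper-left-origin coordinate space.
let viewSize = scnView.bounds.size
let faceRect = VNImageRectForNormalizedRect(observation.boundingBox,
                                            Int(viewSize.width),
                                            Int(viewSize.height))
// Flip vertically so the origin matches UIKit's top-left convention
let flippedFaceRect = CGRect(x: faceRect.origin.x,
                             y: viewSize.height - faceRect.maxY,
                             width: faceRect.width,
                             height: faceRect.height)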

Beyond that, I also suspect this could be an issue with my SCNCamera, as I am unsure how its transform/projection matrix works and whether it needs to be configured here.
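
To make that second point concrete, this is the kind of adjustment I have been wondering about (a sketch only, not something I have verified): deriving the SCNCamera's field of view from the intrinsics already attached to the sample buffer, rather than relying on SceneKit's defaults. Here, sampleBuffer, pixelBuffer, and camera refer to the capture and scene objects shown in the samples below.

Code Block
// Sketch: derive a horizontal field of view for the SCNCamera from the camera
// intrinsics attached to the sample buffer (pinhole model: fov = 2 * atan(w / (2 * fx))).
// This is an idea I am exploring, not confirmed working code.
if let data = CMGetAttachment(sampleBuffer,
                              key: kCMSampleBufferAttachmentKey_CameraIntrinsicMatrix,
                              attachmentModeOut: nil) as? Data {
    let intrinsics = data.withUnsafeBytes { $0.load(as: matrix_float3x3.self) }
    let fx = intrinsics.columns.0.x                          // focal length in pixels
    let imageWidth = Float(CVPixelBufferGetWidth(pixelBuffer))
    let fovDegrees = 2 * atan(imageWidth / (2 * fx)) * 180 / .pi
    camera.projectionDirection = .horizontal                 // fieldOfView measures the horizontal axis
    camera.fieldOfView = CGFloat(fovDegrees)
}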

Sample of Vision Request Setup:
Code Block
// Setup Vision request options
var requestHandlerOptions: [VNImageOption: AnyObject] = [:]

// Attach the camera intrinsics, if available, so Vision can account for the lens geometry
let cameraIntrinsicData = CMGetAttachment(sampleBuffer, key: kCMSampleBufferAttachmentKey_CameraIntrinsicMatrix, attachmentModeOut: nil)
if cameraIntrinsicData != nil {
    requestHandlerOptions[VNImageOption.cameraIntrinsics] = cameraIntrinsicData
}

// Determine the EXIF orientation for the current device orientation
let exifOrientation = self.exifOrientationForCurrentDeviceOrientation()

// Setup the Vision request handler
let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer,
                                    orientation: exifOrientation,
                                    options: requestHandlerOptions)

// Setup the completion handler
let completion: VNRequestCompletionHandler = { request, error in
    guard let observations = request.results as? [VNFaceObservation] else { return }
    // Draw faces on the main queue
    DispatchQueue.main.async {
        drawFaceGeometry(observations: observations)
    }
}

// Setup the face landmarks request
let request = VNDetectFaceLandmarksRequest(completionHandler: completion)

// Perform the request
do {
    try handler.perform([request])
} catch {
    print(error)
}
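
The exifOrientationForCurrentDeviceOrientation() helper isn't shown above; a minimal sketch of such a helper, assuming the front-camera mapping used in Apple's face-tracking sample code, looks like this (the rear camera would need a different, non-mirrored mapping, which may be related to the rear-camera misalignment):

Code Block
// Sketch of the orientation helper (front-camera mapping only; values are an
// assumption based on common Vision sample code, not confirmed against my setup).
func exifOrientationForCurrentDeviceOrientation() -> CGImagePropertyOrientation {
    switch UIDevice.current.orientation {
    case .portraitUpsideDown:
        return .rightMirrored
    case .landscapeLeft:
        return .downMirrored
    case .landscapeRight:
        return .upMirrored
    default: // .portrait and unknown orientations
        return .leftMirrored
    }
}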


Sample of SCNView Setup:

Code Block
// Setup SCNView
let scnView = SCNView()
scnView.translatesAutoresizingMaskIntoConstraints = false
self.view.addSubview(scnView)
scnView.showsStatistics = true
NSLayoutConstraint.activate([
    scnView.leadingAnchor.constraint(equalTo: self.view.leadingAnchor),
    scnView.topAnchor.constraint(equalTo: self.view.topAnchor),
    scnView.bottomAnchor.constraint(equalTo: self.view.bottomAnchor),
    scnView.trailingAnchor.constraint(equalTo: self.view.trailingAnchor)
])
// Setup scene
let scene = SCNScene()
scnView.scene = scene
// Setup camera
let cameraNode = SCNNode()
let camera = SCNCamera()
cameraNode.camera = camera
scnView.scene?.rootNode.addChildNode(cameraNode)
cameraNode.position = SCNVector3(x: 0, y: 0, z: 16)
// Setup light
let ambientLightNode = SCNNode()
ambientLightNode.light = SCNLight()
ambientLightNode.light?.type = SCNLight.LightType.ambient
ambientLightNode.light?.color = UIColor.darkGray
scnView.scene?.rootNode.addChildNode(ambientLightNode)


Sample of "face processing"
Code Block
func drawFaceGeometry(observations: [VNFaceObservation]) {
    // An array of face nodes, one SCNNode for each detected face
    var faceNodes = [SCNNode]()
    // Project the world origin to get the screen-space depth (z) to unproject against
    let projectedOrigin = sceneView.projectPoint(SCNVector3Zero)
    // Iterate through each found face
    for observation in observations {
        // Setup an SCNNode for the face
        let face = SCNNode()
        // Convert the normalized bounding box into view coordinates
        let faceBounds = VNImageRectForNormalizedRect(observation.boundingBox, Int(sceneView.bounds.width), Int(sceneView.bounds.height))
        // Verify we have landmarks
        if let landmarks = observation.landmarks {
            // Landmarks are relative to, and normalized within, the face's bounding box
            let affineTransform = CGAffineTransform(translationX: faceBounds.origin.x, y: faceBounds.origin.y)
                .scaledBy(x: faceBounds.size.width, y: faceBounds.size.height)
            // Add all points as vertices
            var vertices = [SCNVector3]()
            // Verify we have points
            if let allPoints = landmarks.allPoints {
                // Map each normalized point into the bounding box, then unproject it into the scene
                for point in allPoints.normalizedPoints {
                    let viewPoint = point.applying(affineTransform)
                    let projected = SCNVector3(viewPoint.x, viewPoint.y, CGFloat(projectedOrigin.z))
                    let unprojected = sceneView.unprojectPoint(projected)
                    vertices.append(unprojected)
                }
            }
            // Setup indices
            var indices = [UInt16]()
            // Add indices
            // ... Removed for brevity ...
            // Setup texture coordinates
            var coordinates = [CGPoint]()
            // Add texture coordinates
            // ... Removed for brevity ...
            // Normalize the texture coordinates against the (square) texture image size
            let imageWidth: CGFloat = 2048.0
            let normalizedCoordinates = coordinates.map { coord in
                CGPoint(x: coord.x / imageWidth, y: coord.y / imageWidth)
            }
            // Setup sources
            let vertexSource = SCNGeometrySource(vertices: vertices)
            let textureSource = SCNGeometrySource(textureCoordinates: normalizedCoordinates)
            // Setup elements
            let element = SCNGeometryElement(indices: indices, primitiveType: .triangles)
            // Setup geometry
            let geometry = SCNGeometry(sources: [vertexSource, textureSource], elements: [element])
            geometry.firstMaterial?.diffuse.contents = textureImage // texture image defined elsewhere
            // Attach the textured mesh to this face's node
            let customFace = SCNNode(geometry: geometry)
            face.addChildNode(customFace)
            // Append the face to the face nodes array
            faceNodes.append(face)
        }
    }
    // Iterate the face nodes and add them to the scene
    for node in faceNodes {
        sceneView.scene?.rootNode.addChildNode(node)
    }
}
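
As a debugging aid (a sketch only; overlayLayer is a hypothetical CAShapeLayer added over the SCNView), the transformed 2D points can be drawn directly inside the landmark loop, before any unprojection, to check whether the misalignment is introduced by the 2D mapping or by the SceneKit unprojection:

Code Block
// Debugging sketch: draw the transformed 2D landmark points as small circles
// to see whether the 2D mapping already drifts in other orientations.
let path = UIBezierPath()
for point in allPoints.normalizedPoints {
    let viewPoint = point.applying(affineTransform)
    path.append(UIBezierPath(arcCenter: viewPoint, radius: 2,
                             startAngle: 0, endAngle: 2 * .pi, clockwise: true))
}
overlayLayer.path = path.cgPath
overlayLayer.strokeColor = UIColor.green.cgColor
overlayLayer.fillColor = nil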

Replies

Hey Brandon,

It seems likely that you've made an error somewhere along the line in your transformations. Because a solution is likely going to be specific to your case (and because there are several transforms that need to be examined), I recommend that you file a Technical Support Incident to address this issue.

Thanks!
Hi @gchiste,

Thanks very much for your reply! You are right; there are many transforms happening, and it is certainly a specific use case. I've never filed a Technical Support Incident, but I appreciate the suggestion and will certainly take that path, gathering the best details I can to get help. This seems like a great avenue for further learning and resolution.

Have a great day!

Hey @brandonK212

Did you figure out a solution to this? I'm trying to do something similar and running into a similar problem. As a debugging step, I tried to draw the detected rectangle on RealityKit's ARView. When drawn as a sublayer of the ARView, the rectangle seems to move away from the actual face as I slightly turn the device or my head (almost as if it were responding to the pitch, roll, and yaw of the device/face), but the overall dimensions of the rectangle (especially its height) seem to cover the entire face as expected.

I'm using displayTransform from the ARFrame to adjust the bounding box. Here's some code outlining the steps I'm taking:

let viewPortRect = self.arView.frame
let toArViewScaleTx = CGAffineTransform(scaleX: viewPortRect.size.width, y: viewPortRect.size.height)
self.displayTransform = frame.displayTransform(for: interfaceOrientation,
                                               viewportSize: self.viewPortRect.size).concatenating(toArViewScaleTx)

//...

// drawObservations
guard let faceRes = self.facePoseRequest.results?.first as? VNFaceObservation else {
    return
}
let arDisplayRect = faceRes.boundingBox.applying(self.displayTransform)

// Draw rect
let path = UIBezierPath(rect: arDisplayRect)
self.bbLayer.path = path.cgPath // self.bbLayer is created and added as a sublayer of arView.layer in viewDidLoad
self.bbLayer.strokeColor = color.cgColor
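
One thing that might be worth checking (untested on my end): Vision's boundingBox is normalized with a lower-left origin, while the normalized image coordinates that displayTransform(for:viewportSize:) converts from appear to use a top-left origin, so the rect may need a vertical flip before the transform is applied. Roughly:

// Sketch (assumption: the origin mismatch is the cause of the drift).
// Flip the normalized rect vertically before applying the display transform.
let flipVertically = CGAffineTransform(scaleX: 1, y: -1).translatedBy(x: 0, y: -1)
let arDisplayRect = faceRes.boundingBox
    .applying(flipVertically)          // normalized rect, now top-left origin
    .applying(self.displayTransform)   // then into view coordinates as before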