Multiple Image Recognition with the same marker

Hello,


I am working on an AR application written in Swift and I am facing a use case that ARKit does not support out of the box: recognizing the same marker image multiple times.

Apple already documents this limitation:


https://developer.apple.com/documentation/arkit/recognizing_images_in_an_ar_experience#2958517

Consider when to allow detection of each image to trigger (or repeat) AR interactions. ARKit adds an image anchor to a session exactly once for each reference image in the session configuration's `detectionImages` array. If your AR experience adds virtual content to the scene when an image is detected, that action will by default happen only once. To allow the user to experience that content again without restarting your app, call the session's `remove(anchor:)` method to remove the corresponding `ARImageAnchor`. After the anchor is removed, ARKit will add a new anchor the next time it detects the image.

For example, in the case described above, where spaceships appear to fly out of a movie poster, you might not want an extra copy of that animation to appear while the first one is still playing. Wait until the animation ends to remove the anchor, so that the user can trigger it again by pointing their device at the image.
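For reference, the pattern Apple describes could be sketched like this (a minimal sketch: the `sceneView` property and the 10-second content duration are assumptions for illustration):

func renderer(_ renderer: SCNSceneRenderer, didAdd node: SCNNode, for anchor: ARAnchor) {
    guard let imageAnchor = anchor as? ARImageAnchor else { return }

    // Attach the one-shot virtual content to `node` here.

    // When the content is done, remove the anchor so that ARKit
    // can add a new one the next time it detects the image.
    // The 10 s duration is a placeholder for the content's length.
    DispatchQueue.main.asyncAfter(deadline: .now() + 10) { [weak self] in
        self?.sceneView.session.remove(anchor: imageAnchor)
    }
}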

I would like to know why Apple has not covered this use case. Maybe it will be implemented in a future version of ARKit.


In the meantime I have built a workaround for this case, and I would like to hear your opinion on it:


First, Apple mentions that to re-enable image detection we should remove the anchor from the session. However, if we do that directly inside the

func renderer(_ renderer: SCNSceneRenderer, didAdd node: SCNNode, for anchor: ARAnchor)

delegate method, ARKit immediately re-detects the image and adds a new anchor, so we loop indefinitely.

So I keep a temporary list of detected `ARImageAnchor` instances and compare each new anchor against it. To compare two anchors, I compute the intersection over union (IoU) of their projections onto the 2D screen plane (intersection area / union area): a value close to 1 means the two anchors cover the same image.

Below is my source code:

/// Temporary array of `ARImageDetected` entries saving the
/// images already detected (for multi-detection)
internal var tmpImageAnchorDetected: [ARImageDetected] = []


struct ARImageDetected {
    var anchor: ARImageAnchor
    var alignment: ARPlaneAnchor.Alignment
}


extension CGRect {
    var area: CGFloat {
        return height * width
    }

    func intersectionOverUnion(rect: CGRect) -> CGFloat {
        return rect.intersection(self).area / self.union(rect).area
    }
}
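As a quick sanity check of the IoU value (numbers are illustrative):

// Two 100×100 rects overlapping on a 50×100 strip:
// intersection = 5 000, bounding-box union = 15 000, IoU ≈ 0.33.
let a = CGRect(x: 0, y: 0, width: 100, height: 100)
let b = CGRect(x: 50, y: 0, width: 100, height: 100)
print(a.intersectionOverUnion(rect: b)) // 0.333...

Note that `CGRect.union` returns the smallest rectangle containing both operands, so strictly speaking this is intersection over bounding box rather than over the true union; for nearly coincident detections of the same marker the difference is negligible.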


    /// Decides whether the anchor passed as parameter should be treated as
    /// a new detection, i.e. it does not overlap an image we already track.
    ///
    /// - Parameter anchor: An `ARImageAnchor` instance.
    func shouldDetect(anchor: ARImageAnchor) -> Bool {
        guard let focusAlignment = self.focus.recentFocusAlignments.last,
            let sourceProjection = projection(from: anchor,
                                              alignment: focusAlignment) else { return false }

        // IoU of the candidate anchor against every anchor already detected.
        let scores = tmpImageAnchorDetected.compactMap { (detected) -> CGFloat? in
            guard let targetProjection = projection(from: detected.anchor,
                                                    alignment: detected.alignment) else { return nil }
            return sourceProjection.intersectionOverUnion(rect: targetProjection)
        }
        // Reject the anchor if it overlaps a known detection by more than 50%.
        return !scores.contains { $0 > 0.5 }
    }


    /// Returns the projection of an `ARImageAnchor` from the 3D world space
    /// detected by ARKit into the 2D space of the view rendering the scene.
    ///
    /// - Parameter anchor: The anchor to project.
    /// - Parameter alignment: Plane alignment of the detected image.
    /// - Parameter debug: When `true`, draws the projected corners on screen.
    /// - Returns: An optional `CGRect` corresponding to the `ARImageAnchor` projection.
    internal func projection(from anchor: ARImageAnchor,
                             alignment: ARPlaneAnchor.Alignment,
                             debug: Bool = false) -> CGRect? {
        guard let camera = session.currentFrame?.camera else {
            return nil
        }

        let refImg = anchor.referenceImage
        let anchor3DPoint = anchor.transform.columns.3

        let size = view.bounds.size
        // Half extents of the physical image, in metres.
        let width = Float(refImg.physicalSize.width / 2)
        let height = Float(refImg.physicalSize.height / 2)


        let projection = ProjectionHelper.projection(from: anchor3DPoint,
                                                     width: width,
                                                     height: height,
                                                     focusAlignment: alignment)
        let topLeft = projection.0
        let topLeftProjected = camera.projectPoint(topLeft,
                                          orientation: .portrait,
                                          viewportSize: size)

        let topRight = projection.1
        let topRightProjected = camera.projectPoint(topRight,
                                           orientation: .portrait,
                                           viewportSize: size)

        let bottomLeft = projection.2
        let bottomLeftProjected = camera.projectPoint(bottomLeft,
                                             orientation: .portrait,
                                             viewportSize: size)

        let bottomRight = projection.3
        let bottomRightProjected = camera.projectPoint(bottomRight,
                                              orientation: .portrait,
                                              viewportSize: size)

        // Approximate the projected quad with an axis-aligned rectangle:
        // width from the top edge, height from the left edge.
        let result = CGRect(origin: topLeftProjected,
                            size: CGSize(width: topRightProjected.distance(point: topLeftProjected),
                                         height: bottomLeftProjected.distance(point: topLeftProjected)))

        if debug {
            DispatchQueue.main.async { [unowned self] in
                self.createIndicator(position: topLeft)
                self.createIndicator(position: topRight)
                self.createIndicator(position: bottomLeft)
                self.createIndicator(position: bottomRight)
  
                let dd = Draw(frame: result)
                self.view.addSubview(dd)
            }
        }

        return result
    }


class ProjectionHelper {
    /// Returns the four corners (top left, top right, bottom left, bottom right)
    /// of an image with half extents `width` and `height` centred on `vector`,
    /// ignoring the anchor's rotation.
    static func projection(from vector: simd_float4,
                           width: Float,
                           height: Float,
                           focusAlignment: ARPlaneAnchor.Alignment)
        -> (vector_float3, vector_float3, vector_float3, vector_float3) {
            switch focusAlignment {
            case .horizontal:
                // Image lying flat in the x/z plane; -z points to the image top.
                return (
                    vector_float3([vector[0] - width, vector[1], vector[2] - height]), // top left
                    vector_float3([vector[0] + width, vector[1], vector[2] - height]), // top right
                    vector_float3([vector[0] - width, vector[1], vector[2] + height]), // bottom left
                    vector_float3([vector[0] + width, vector[1], vector[2] + height])  // bottom right
                )
            case .vertical:
                // Image on a wall in the x/y plane; +y points to the image top.
                return (
                    vector_float3([vector[0] - width, vector[1] + height, vector[2]]), // top left
                    vector_float3([vector[0] + width, vector[1] + height, vector[2]]), // top right
                    vector_float3([vector[0] - width, vector[1] - height, vector[2]]), // bottom left
                    vector_float3([vector[0] + width, vector[1] - height, vector[2]])  // bottom right
                )
            }
    }
}
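For example, for a hypothetical 20 cm × 10 cm marker lying flat and centred at (0, 0, -1) in world space:

let corners = ProjectionHelper.projection(from: simd_float4(0, 0, -1, 1),
                                          width: 0.1,   // half of 0.20 m
                                          height: 0.05, // half of 0.10 m
                                          focusAlignment: .horizontal)
// corners.0 == vector_float3(-0.1, 0.0, -1.05) // top left corner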


Below is how to use my implementation:


func KZSARDetected(_ renderer: SCNSceneRenderer, didAdd node: SCNNode, image anchor: ARImageAnchor) {
    DispatchQueue.main.async { [unowned self] in
        // Always remove the anchor so ARKit can detect the image again.
        self.session.remove(anchor: anchor)
        guard let alignment = self.focus.recentFocusAlignments.last else {
            return
        }
        // Only trigger the AR content if this is not a re-detection
        // of an image we are already tracking.
        if self.shouldDetect(anchor: anchor) {
            self.showsCatalogView(node: node)
            self.tmpImageAnchorDetected.append(ARImageDetected(anchor: anchor, alignment: alignment))
        }
    }
}
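Note: the code above also relies on a `CGPoint.distance(point:)` helper (plus project-specific pieces such as `focus`, `createIndicator`, `Draw` and `showsCatalogView` that I have not included here). A minimal version of the distance helper could be:

import CoreGraphics

extension CGPoint {
    /// Euclidean distance to another point.
    func distance(point: CGPoint) -> CGFloat {
        return hypot(point.x - x, point.y - y)
    }
}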


Thanks


Ysée

Kaizen-Solutions Lab

Accepted Reply

I found a solution to get the corner 3D points of an `ARImageAnchor` from `anchor.transform` and project them into 2D space:


        /// Returns the projection of an `ARImageAnchor` from the 3D world space
        /// detected by ARKit into the 2D space of the view rendering the scene.
        ///
        /// - Parameter anchor: The anchor to project.
        /// - Returns: An optional `CGRect` corresponding to the `ARImageAnchor` projection.
        func projection(from anchor: ARImageAnchor) -> CGRect? {
            guard let camera = session.currentFrame?.camera else {
                return nil
            }

            let refImg = anchor.referenceImage
            // Transposed so that we can left-multiply row vectors below:
            // v * Mᵀ is the same as M * v for a column vector v.
            let transform = anchor.transform.transpose

            let size = view.bounds.size
            // Half extents of the physical image, in metres.
            let width = Float(refImg.physicalSize.width / 2)
            let height = Float(refImg.physicalSize.height / 2)

            // Corner points in world space. In the anchor's local space the
            // image lies in the x/z plane and -z points to the top of the image.
            let pointsWorldSpace = [
                matrix_multiply(simd_float4([ width, 0, -height, 1]), transform).vector_float3, // top right
                matrix_multiply(simd_float4([ width, 0,  height, 1]), transform).vector_float3, // bottom right
                matrix_multiply(simd_float4([-width, 0, -height, 1]), transform).vector_float3, // top left
                matrix_multiply(simd_float4([-width, 0,  height, 1]), transform).vector_float3  // bottom left
            ]

            // Project the 3D points into the 2D viewport space.
            let pointsViewportSpace = pointsWorldSpace.map { (point) -> CGPoint in
                return camera.projectPoint(point,
                                           orientation: .portrait,
                                           viewportSize: size)
            }

            // Approximate the projected quad with a rectangle so that we can
            // calculate the intersection over union with other `ARImageAnchor`s.
            let result = CGRect(origin: pointsViewportSpace[2], // top left
                                size: CGSize(width: pointsViewportSpace[0].distance(point: pointsViewportSpace[2]),
                                             height: pointsViewportSpace[3].distance(point: pointsViewportSpace[2])))

            return result
        }


See my Stack Overflow question: https://stackoverflow.com/questions/49861366/arkit-projection-of-aranchor-to-2d-space/49977843#49977843
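The snippet also uses a `vector_float3` property on `simd_float4`, which is not part of the simd module; a plausible implementation (dropping the homogeneous w component after the multiplication) would be:

extension simd_float4 {
    /// Drops the homogeneous w component to get a 3D point.
    var vector_float3: simd_float3 {
        return simd_float3(x, y, z)
    }
}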
