Why do we need `rotateToARCamera` in Apple's `Visualizing a Point Cloud Using Scene Depth` sample code?

Sample code: https://developer.apple.com/documentation/arkit/visualizing_a_point_cloud_using_scene_depth
In the code, when unprojecting the depth map into world points, we use a positive z value (the depth value). But in my understanding, ARKit uses a right-handed coordinate system in which the camera looks down the negative z axis, so points with a positive z value are behind the camera. So it makes sense that some extra work is needed to align the coordinate systems (using the `rotateToARCamera` matrix?). What I cannot understand is why we need to flip both the Y and Z axes.
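For context, this is my reading of the unprojection step (the original lives in the sample's Metal shader; this Swift sketch just mirrors the math, and the parameter names are mine):

```swift
import simd

// Sketch of the unprojection step as I understand it, assuming
// `cameraIntrinsicsInversed` is ARCamera.intrinsics.inverse and
// (u, v) is a pixel coordinate in the camera image.
func unprojectToCameraSpace(u: Float, v: Float, depth: Float,
                            cameraIntrinsicsInversed: simd_float3x3) -> simd_float3 {
    // K⁻¹ * [u, v, 1] * depth yields a point with +z pointing *into*
    // the scene (the pinhole-camera convention: x right, y down, z forward),
    // not ARKit's camera convention (x right, y up, z toward the viewer).
    return cameraIntrinsicsInversed * simd_float3(u, v, 1) * depth
}
```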

```swift
static func makeRotateToARCameraMatrix(orientation: UIInterfaceOrientation) -> matrix_float4x4 {
  // flip to ARKit Camera's coordinate
  let flipYZ = matrix_float4x4(
    [1, 0, 0, 0],
    [0, -1, 0, 0],
    [0, 0, -1, 0],
    [0, 0, 0, 1] )

  // `.degreesToRadian` (Float.pi / 180) and `Float3` (SIMD3<Float>) are
  // helpers defined elsewhere in the sample.
  let rotationAngle = Float(cameraToDisplayRotation(orientation: orientation)) * .degreesToRadian
  return flipYZ * matrix_float4x4(simd_quaternion(rotationAngle, Float3(0, 0, 1)))
}
```
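For what it's worth, elsewhere in the sample (Renderer.swift, if I'm reading it correctly) this matrix is composed with the inverted view matrix to build the local-to-world transform that gets applied to the unprojected point. A paraphrased sketch of my reading, where `makeRotateToARCameraMatrix` is the function quoted above:

```swift
import ARKit

// My reading (paraphrased, not verbatim) of how the sample composes
// the matrix before handing it to the shader.
func makeLocalToWorldMatrix(frame: ARFrame,
                            orientation: UIInterfaceOrientation) -> matrix_float4x4 {
    // rotateToARCamera maps from the intrinsics' convention into ARKit
    // camera space; the inverted view matrix then maps camera space to world.
    let rotateToARCamera = makeRotateToARCameraMatrix(orientation: orientation)
    return frame.camera.viewMatrix(for: orientation).inverse * rotateToARCamera
}
```

So the question stands: flipping Z makes sense to move +z from "into the scene" to "toward the viewer", but why does Y need to be flipped as well?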
