I’d like to organize the important points of my question, because I may be misunderstanding the definition of the extrinsic matrix.
1. The extrinsic matrix is defined as the transformation from the world coordinate system to the camera coordinate system. That is, applying the extrinsic matrix to a point’s coordinate vector in the world coordinate system gives that point’s coordinate vector in the camera coordinate system.
2. With an iPhone X (iOS 11.4.1) in dual camera, dual photo delivery mode, I get the following.
Telephoto camera (reference camera) extrinsic matrix:
[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]]
wide-angle camera extrinsic matrix
[[0.999999, 0.0003352, 0.00140538], [-0.000322289, 0.999958, -0.00917725], [-0.0014084, 0.00917679, 0.999957], [-13.3001, -0.00406908, 0.0237064]]
The telephoto camera is the reference camera; this is specified by iOS.
So the telephoto camera’s extrinsic matrix is [identity rotation matrix + zero translation vector], which means the telephoto camera’s coordinate system is identical to the world coordinate system.
On the other hand, what does the content of the wide-angle camera’s extrinsic matrix indicate?
Because the rotation part is nearly the identity, the origin of the wide-angle camera’s coordinate system is at approximately [13.3001, 0.00406908, -0.0237064] in the world coordinate system. (In other words, the origin of the world coordinate system is at [-13.3001, -0.00406908, 0.0237064] in the wide-angle camera’s coordinate system.)
So, since the wide-angle camera is located above the telephoto camera on the iPhone X, the x-axis of the camera coordinate system points almost exactly from the bottom of the device to the top.
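To make the reading above checkable, here is a minimal Python sketch of how I am interpreting the numbers. It rests on two assumptions that I have not confirmed from documentation: that the four printed lists are the columns of the 4x3 matrix (three rotation columns followed by the translation column), and that the matrix acts as x_cam = R * x_world + t. The names WIDE_COLUMNS, world_to_camera and camera_center_in_world are my own and are not part of any Apple API.

```python
# Columns of the wide-angle extrinsic matrix as printed above.
# ASSUMPTION: column-major layout, i.e. three rotation columns, then translation.
WIDE_COLUMNS = [
    [0.999999, 0.0003352, 0.00140538],
    [-0.000322289, 0.999958, -0.00917725],
    [-0.0014084, 0.00917679, 0.999957],
    [-13.3001, -0.00406908, 0.0237064],
]


def world_to_camera(columns, p_world):
    """Apply x_cam = R * x_world + t (ASSUMED convention).

    R's columns are columns[0:3]; t is columns[3].
    """
    r_cols, t = columns[:3], columns[3]
    return [
        sum(r_cols[j][i] * p_world[j] for j in range(3)) + t[i]
        for i in range(3)
    ]


def camera_center_in_world(columns):
    """Solve R * C + t = 0 for C, i.e. C = -R^T * t (R taken as orthonormal)."""
    r_cols, t = columns[:3], columns[3]
    return [-sum(r_cols[j][i] * t[i] for i in range(3)) for j in range(3)]


if __name__ == "__main__":
    # World origin (the telephoto camera) expressed in wide-angle coordinates.
    print(world_to_camera(WIDE_COLUMNS, [0.0, 0.0, 0.0]))
    # Wide-angle camera origin expressed in world (telephoto) coordinates.
    print(camera_center_in_world(WIDE_COLUMNS))
```

Under these assumptions the world origin maps exactly to the translation column, while the wide-angle camera’s origin in world coordinates is -R^T * t. Since R is nearly the identity, that differs from simply negating the translation only by small terms, and the x component stays close to +13.3.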
From here on, I am speculating. A right-handed convention is probably used, and the z-axis probably runs either from the front of the device to the back or from the back to the front. That would make the camera coordinate system either {x-axis: device bottom to top, y-axis: device left to right, z-axis: device front to back} or {x-axis: device bottom to top, y-axis: device right to left, z-axis: device back to front}.
Furthermore, the Exif orientation value of this image data is 6 (90 degrees counterclockwise). So the most likely camera coordinate system is {x-axis: device bottom to top, y-axis: device left to right, z-axis: device front to back}.
3. My question is: which camera coordinate system is used by AVCapturePhoto’s cameraCalibrationData?.extrinsicMatrix? I’d like to know whether any documentation on this exists.