CoreML Image Size Identification

Suppose I have this photo for an ML model.

How would I calculate the measurements shown in the image below?

How can I calculate these measurements for the JSON file?

Any suggestions, whether by direct or indirect methods?


Accepted Answer

Hey, thanks for writing! This article on Building an Object Detector Data Source (which may be where you got the image below) is a great resource for this. The key to understanding how to set this up is a few parameters, each of which you can configure for your use case:

MLBoundingBoxUnits, which can be set as either pixels or normalized. If you choose pixels, the coordinates that you'll provide for the bounding boxes are simply the raw number of pixels out of the total width and height. So if you have an image that's 300x300 pixels and you want to specify a point exactly in the middle, the coordinates would be (150, 150). With normalized, it's on a scale from 0 to 1 instead, so a point in the middle would be (0.5, 0.5).
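
For example, here's a rough Swift sketch of that pixels-to-normalized conversion, using the hypothetical 300x300 image from above (the helper name is just for illustration, not a Create ML API):

```swift
import CoreGraphics

// The answer's hypothetical 300x300 example image.
let imageSize = CGSize(width: 300, height: 300)

/// Converts a point given in raw pixels to normalized (0-1) coordinates.
func normalized(_ point: CGPoint, in size: CGSize) -> CGPoint {
    CGPoint(x: point.x / size.width, y: point.y / size.height)
}

let centerInPixels = CGPoint(x: 150, y: 150)   // middle of the 300x300 image
let centerNormalized = normalized(centerInPixels, in: imageSize)
// centerNormalized is (0.5, 0.5)
```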

MLBoundingBoxCoordinatesOrigin, which can be set as either topLeft or bottomLeft. This specifies where you start counting from. The image you've provided counts from the top left, so points along the top have a y-coordinate of 0, while points along the bottom have a y-coordinate of 301 (if pixels is the MLBoundingBoxUnits choice). For bottomLeft, you would flip that.
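
A quick sketch of what "flipping" between a topLeft and bottomLeft origin means, treating coordinates as continuous pixel values (the helper here is hypothetical, not a Create ML API):

```swift
import CoreGraphics

// With a top-left origin y grows downward; with a bottom-left origin it grows
// upward, so the same point's y-value is mirrored about the image height.
func convertOrigin(_ point: CGPoint, imageHeight: CGFloat) -> CGPoint {
    CGPoint(x: point.x, y: imageHeight - point.y)
}

// A point 20 pixels below the top edge of a 300-pixel-tall image:
let fromTopLeft = CGPoint(x: 10, y: 20)
let fromBottomLeft = convertOrigin(fromTopLeft, imageHeight: 300)
// fromBottomLeft.y == 280 when measured from the bottom edge instead
```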

MLBoundingBoxAnchor, which can be set to topLeft, bottomLeft, or center. This specifies which point within a bounding box you're defining the box from. In this example, center is selected, so the coordinates reported for this box are the distance from the center of the box to the top left of the image (since the coordinates origin is topLeft), measured in pixels (since the units are pixels).
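
As a sketch of how you'd compute those center-anchored values yourself and what one annotation entry looks like (the dictionary shape follows the Building an Object Detector Data Source article; the label and numbers are placeholders, with topLeft origin, pixel units, and center anchor assumed):

```swift
import CoreGraphics

/// Given a box measured from its top-left corner, returns the center point.
func boxCenter(topLeft: CGPoint, width: CGFloat, height: CGFloat) -> CGPoint {
    CGPoint(x: topLeft.x + width / 2, y: topLeft.y + height / 2)
}

let center = boxCenter(topLeft: CGPoint(x: 100, y: 50), width: 80, height: 60)

// One entry of the annotations JSON, expressed as a Swift dictionary:
let annotation: [String: Any] = [
    "label": "myObject",
    "coordinates": [
        "x": center.x,      // 140
        "y": center.y,      // 80
        "width": 80,
        "height": 60
    ]
]
```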

Hope that helps!

Thank you for your valuable reply, I needed it a lot.
