Converting between Vision coordinates and pixels?

I'm working with some data scientists on a coremltools model. I've previously worked with Apple's OCR model, which can be passed a regionOfInterest in Vision coordinates (normalized, lower-left origin) and returns coordinates that are normalized across the entire image frame.

It looks like coremltools is passed an image in pixel coordinates. Is there any data available on the Python side to convert the results back into normalized coordinates for the entire image frame? Or is the most they could do to normalize the results across the regionOfInterest?
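
For concreteness, here is a minimal sketch of the conversion being asked about. It assumes the model sees exactly the regionOfInterest crop (resized without letterboxing), that the model reports boxes in pixels with a top-left origin, and that the regionOfInterest rect is somehow made available alongside the results. The names Rect and crop_pixels_to_vision are hypothetical and not part of coremltools or Vision.

```python
from typing import NamedTuple, Tuple


class Rect(NamedTuple):
    """Axis-aligned rectangle: origin (x, y) plus width and height."""
    x: float
    y: float
    width: float
    height: float


def crop_pixels_to_vision(box_px: Rect,
                          crop_size_px: Tuple[float, float],
                          roi: Rect) -> Rect:
    """Map a pixel box inside the ROI crop to full-image Vision coordinates.

    box_px       -- box in crop pixels, top-left origin (assumed model output)
    crop_size_px -- (width, height) of the image the model received
    roi          -- regionOfInterest, normalized, lower-left origin,
                    relative to the full image (assumed to be known here)
    """
    crop_w, crop_h = crop_size_px

    # Step 1: normalize within the crop.
    nx = box_px.x / crop_w
    nw = box_px.width / crop_w
    nh = box_px.height / crop_h

    # Step 2: flip the y axis (top-left origin -> lower-left origin).
    # The box's lower edge in pixel space becomes its origin in Vision space.
    ny = 1.0 - (box_px.y + box_px.height) / crop_h

    # Step 3: scale and offset by the ROI to get full-frame coordinates.
    return Rect(
        x=roi.x + nx * roi.width,
        y=roi.y + ny * roi.height,
        width=nw * roi.width,
        height=nh * roi.height,
    )


if __name__ == "__main__":
    roi = Rect(0.25, 0.5, 0.5, 0.25)      # ROI in full-image Vision coords
    box = Rect(50, 10, 100, 40)           # model output in crop pixels
    print(crop_pixels_to_vision(box, (200, 100), roi))
```

Without the ROI rect, only steps 1 and 2 are possible, which is the "normalized across the regionOfInterest" case described above.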