-
Re: Support for image outputs
FrankSchlegel Jul 2, 2017 2:02 PM (in response to FrankSchlegel)I just tried to get the result into a Metal buffer to convert it in a compute shader, but I'm running into the issue that Metal doesn't support doubles. And Core ML seems to always return doubles in the output (even if I tell it to use FLOAT32 in the spec).
How does CoreML actually do it under the hood when the GPU doesn't support double precision?
-
Re: Support for image outputs
kerfuffle Jul 3, 2017 5:08 AM (in response to FrankSchlegel)If your model outputs an image (i.e. something with width, height, and a depth of 3 or 4 channels), then Core ML can interpret that as an image. You need to pass a parameter for this in the coremltools conversion script, so that Core ML knows this output should be interpreted as an image.
-
Re: Support for image outputs
FrankSchlegel Jul 3, 2017 5:53 AM (in response to kerfuffle)How do I do that? The NeuralNetworkBuilder has only a method for pre-processing image inputs, but not for post-processing outputs. If I try to convert the type of the output directly in the spec, the model compiles (and Xcode shows the format correctly), but the result is wrong.
-
Re: Support for image outputs
kerfuffle Jul 3, 2017 7:36 AM (in response to FrankSchlegel)I guess the output only becomes an image if you specify `class_labels` in the call to convert(), but you're not really building a classifier so that wouldn't work. So what I had in mind is not actually a solution to your problem.
This is why I prefer implementing neural networks with MPS. ;-)
-
Re: Support for image outputs
michael_s Jul 3, 2017 11:07 AM (in response to FrankSchlegel)While the NeuralNetworkBuilder currently does not have options for image outputs, you can use coremltools to modify the resulting model so that the desired multiarray output is treated as an image.
Here is an example helper function:
def convert_multiarray_output_to_image(spec, feature_name, is_bgr=False):
    """
    Convert an output multiarray to be represented as an image.
    This will modify the Model_pb spec passed in.

    Example:
        model = coremltools.models.MLModel('MyNeuralNetwork.mlmodel')
        spec = model.get_spec()
        convert_multiarray_output_to_image(spec, 'imageOutput', is_bgr=False)
        newModel = coremltools.models.MLModel(spec)
        newModel.save('MyNeuralNetworkWithImageOutput.mlmodel')

    Parameters
    ----------
    spec: Model_pb
        The specification containing the output feature to convert.
    feature_name: str
        The name of the multiarray output feature you want to convert.
    is_bgr: boolean
        If the multiarray has 3 channels, set to True for BGR pixel order,
        or False for RGB.
    """
    for output in spec.description.output:
        if output.name != feature_name:
            continue
        if output.type.WhichOneof('Type') != 'multiArrayType':
            raise ValueError("%s is not a multiarray type" % output.name)
        array_shape = tuple(output.type.multiArrayType.shape)
        channels, height, width = array_shape
        from coremltools.proto import FeatureTypes_pb2 as ft
        if channels == 1:
            output.type.imageType.colorSpace = ft.ImageFeatureType.ColorSpace.Value('GRAYSCALE')
        elif channels == 3:
            if is_bgr:
                output.type.imageType.colorSpace = ft.ImageFeatureType.ColorSpace.Value('BGR')
            else:
                output.type.imageType.colorSpace = ft.ImageFeatureType.ColorSpace.Value('RGB')
        else:
            raise ValueError("Channel Value %d not supported for image outputs" % channels)
        output.type.imageType.width = width
        output.type.imageType.height = height
Note: Neural networks can output images from a layer (as CVPixelBuffer), but the values are clamped to the range [0, 255], i.e. values < 0 become 0 and values > 255 become 255.
You can also just keep the output an MLMultiArray and index into the pixels with something like this in Swift:

let imageData = UnsafeMutablePointer<Double>(OpaquePointer(imageMultiArray.dataPointer))
let channelStride = imageMultiArray.strides[0].intValue
let yStride = imageMultiArray.strides[1].intValue
let xStride = imageMultiArray.strides[2].intValue

func pixelOffset(_ channel: Int, _ y: Int, _ x: Int) -> Int {
    return channel * channelStride + y * yStride + x * xStride
}

let topLeftGreenPixel = UInt8(imageData[pixelOffset(1, 0, 0)])
-
Re: Support for image outputs
FrankSchlegel Jul 3, 2017 1:49 PM (in response to michael_s)I actually tried pretty much exactly that (tagging the output as an image). But my problem is that I can't seem to get the CVPixelBuffer back into some displayable format. I'm going to keep trying.
I still don't really understand how CoreML can give me doubles when it's doing its computation on the GPU, though...
-
Re: Support for image outputs
michael_s Jul 3, 2017 3:27 PM (in response to FrankSchlegel)What displayable format are you looking for? Here are some potentially useful methods for converting a CVPixelBuffer output into other representations:
Construct CIImage from CVPixelBuffer:
https://developer.apple.com/documentation/coreimage/ciimage/1438072-init
Construct UIImage from CIImage:
https://developer.apple.com/documentation/uikit/uiimage/1624114-init
Construct a CV Metal Texture from existing CoreVideo buffer:
https://developer.apple.com/documentation/corevideo/1456754-cvmetaltexturecachecreatetexture
https://developer.apple.com/documentation/corevideo/1456868-cvmetaltexturegettexture
-
Re: Support for image outputs
FrankSchlegel Jul 3, 2017 11:26 PM (in response to michael_s)Thanks for your help, Michael.
It actually works! As it turns out, I wasn't trying hard enough. I had already tried the CIImage approach, but the image simply didn't show, neither when inspecting with Quick Look in Xcode nor when displaying in a view. When I checked the memory, it turned out that the alpha channel of the result is 0. So if I render it into a new context with CGImageAlphaInfo.noneSkipLast, it works.
I can work with that, thanks!
-
Re: Support for image outputs
BrianOn99 Jul 10, 2017 1:12 AM (in response to michael_s)Hi Michael, could the team responsible for the Core ML compiler set the alpha channel to 255 by default? Though it is not very inconvenient to set it ourselves, I think it will confuse more people as time goes by.
Could it be considered a bug?
-
Re: Support for image outputs
BrianOn99 Jul 7, 2017 2:09 AM (in response to michael_s)To use the helper you provided, "convert_multiarray_output_to_image", does the model output dataType need to be DOUBLE or INT32, or does either work?
-
Re: Support for image outputs
FrankSchlegel Jul 7, 2017 6:10 AM (in response to BrianOn99)There is no requirement for the type, I think. The output has to be in the shape (channels, height, width) and have either 1 or 3 channels. I guess the internal conversion step will handle the rest.
-
Re: Support for image outputs
BrianOn99 Jul 9, 2017 7:22 PM (in response to FrankSchlegel)Thanks a lot. I will try this out.
-
Re: Support for image outputs
BrianOn99 Jul 9, 2017 11:10 PM (in response to FrankSchlegel)Edit:
I have solved the problem by taking another approach. I take CVPixelBufferGetBaseAddress(myCVPixelBuffer) and then directly set the alpha channel to 255 to "remove" the alpha channel. This approach might be a bit raw, as it uses UnsafeRawPointer and assumes the alpha pixels are located at offsets 3, 7, 11, ... But at least it works.
[Old message]
I think I have got the same problem of getting a "not displayable format", in other words, the image does not show in Quick Look. I can already get the image when an MLMultiArray is returned, but not when the output is converted to CVPixelBuffer in the mlmodel. Sorry, I am quite new to iOS development, so I don't understand how to use "CGImageAlphaInfo.noneSkipLast". I guessed that I need to do the conversion CVPixelBuffer -> CIImage -> CGImage, then create a CGContext with CGImageAlphaInfo.noneSkipLast, then draw the CGImage into the CGContext, and finally get a CGImage from the CGContext.
But somehow the image becomes black. Is this the correct approach? Could you please share some steps of your approach?
Thanks for all your help.
-
Re: Support for image outputs
FrankSchlegel Jul 10, 2017 12:06 AM (in response to BrianOn99)For me this works without changing the pixel buffer (output is your CVPixelBuffer):
CVPixelBufferLockBaseAddress(output, .readOnly)
let width = CVPixelBufferGetWidth(output)
let height = CVPixelBufferGetHeight(output)
let data = CVPixelBufferGetBaseAddress(output)!
let outContext = CGContext(data: data,
                           width: width,
                           height: height,
                           bitsPerComponent: 8,
                           bytesPerRow: CVPixelBufferGetBytesPerRow(output),
                           space: CGColorSpaceCreateDeviceRGB(),
                           bitmapInfo: CGImageAlphaInfo.noneSkipLast.rawValue)!
let outImage = outContext.makeImage()!
CVPixelBufferUnlockBaseAddress(output, .readOnly)
-
Re: Support for image outputs
lozanoleonardo Jul 9, 2017 11:03 PM (in response to michael_s)Hey Michael, can I use this script to convert my MLMultiArray input to be an Image instead? I'm guessing that it needs to consider
spec.description.input
in this case.
-
Re: Support for image outputs
BrianOn99 Jul 9, 2017 11:23 PM (in response to lozanoleonardo)Just for your reference, if your network's original output shape is [3, 511, 511], then after conversion to CVPixelBuffer as output, the diff is only:
--- oil_mlmodel_yespost.pb 2017-07-10 11:00:21.078301960 +0800
+++ oil_mlmodel_yespost_cvbuffer.pb 2017-07-10 10:59:38.374233180 +0800
@@ -13,11 +13,10 @@
 output {
   name: "output"
   type {
-    multiArrayType {
-      shape: 3
-      shape: 511
-      shape: 511
-      dataType: DOUBLE
+    imageType {
+      width: 511
+      height: 511
+      colorSpace: RGB
     }
   }
 }
And in my case I just need to run convert_multiarray_output_to_image(tm, 'output'), where tm is my model's spec. I don't need to specify the input.
-
Re: Support for image outputs
lozanoleonardo Jul 10, 2017 10:38 PM (in response to BrianOn99)I guess my question is about converting the input type MLMultiArray to Image instead of the output.
-
Re: Support for image outputs
BrianOn99 Jul 12, 2017 3:48 AM (in response to lozanoleonardo)Sorry I misunderstood that.
Conversion of input to image is usually achieved by supplying the parameter "image_input_names" in the coreml conversion function. Why didn't you take that approach?
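For example (a minimal sketch; keras_model and the feature names are placeholders, not from this thread):

import coremltools

coreml_model = coremltools.converters.keras.convert(
    keras_model,
    input_names='image',
    image_input_names='image')  # tells the converter to expose this input as an image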
-
Re: Support for image outputs
lozanoleonardo Jul 14, 2017 2:40 PM (in response to BrianOn99)Oh, I didn't take that approach because a couple of users on Stack Overflow reported that it didn't work and provided a link to this thread. I guess it's my fault for not trying it before asking.
Thanks again.
-
Re: Support for image outputs
fakrueger Aug 23, 2017 8:13 PM (in response to michael_s)Thanks for those details, it really helped! One question though:
How would you apply scale and biases to the image data before conversion? For instance, my network outputs in the range [-1, 1] and I need to convert that to [0, 255].
When using the Keras converter, the input image can thankfully be scaled and biased. Can that code be reused somehow?
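For reference, this is roughly what those input-side options look like in the Keras converter (a sketch with placeholder names; the preprocessing applies to inputs only, which is exactly the limitation you're running into):

import coremltools

# Maps input pixels from [0, 255] to [-1, 1]: out = (2/255) * pixel - 1
coreml_model = coremltools.converters.keras.convert(
    keras_model,
    input_names='image',
    image_input_names='image',
    image_scale=2 / 255.0,
    red_bias=-1.0,
    green_bias=-1.0,
    blue_bias=-1.0)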
-
Re: Support for image outputs
BrianOn99 Aug 24, 2017 2:21 AM (in response to fakrueger)I have tackled a similar post-processing problem (subtracting VGG constants) by manually inserting a 1x1 convolution. For your particular problem, you may try adding a 1x1x3x3 (1-by-1 kernel, 3 image channels) conv layer with weights
[127.5, 0, 0
0, 127.5, 0
0, 0, 127.5]
and bias
[127.5, 127.5, 127.5]
Place it after your model's final layer.
This operation will scale each channel separately into [-127.5, 127.5] and then add 127.5 to each channel. I have not tried it; this is just to give you a direction to work on (see the sketch below).
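Here is a rough, untested sketch of that idea with coremltools' NeuralNetworkBuilder (the layer and blob names are placeholders, and I'm assuming you have a builder for your model; alternatively you can append the layer to the spec directly):

import numpy as np

# 1x1 convolution, 3 input channels -> 3 output channels.
# For convolution layers coremltools expects W with shape
# (height, width, kernel_channels, output_channels).
W = np.zeros((1, 1, 3, 3))
W[0, 0, 0, 0] = W[0, 0, 1, 1] = W[0, 0, 2, 2] = 127.5  # 127.5 * identity
b = np.array([127.5, 127.5, 127.5])

builder.add_convolution(name='denormalize',
                        kernel_channels=3, output_channels=3,
                        height=1, width=1,
                        stride_height=1, stride_width=1,
                        border_mode='valid', groups=1,
                        W=W, b=b, has_bias=True,
                        input_name='final_layer_output',  # your model's last blob
                        output_name='output')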
As an aside, resetting the alpha channel is no longer required as of Xcode beta 5.
-
Re: Support for image outputs
FrankSchlegel Aug 24, 2017 2:47 AM (in response to BrianOn99)There's actually a bias layer in the spec that does exactly that. No need for the work-around. Here is my helper for the NeuralNetworkBuilder:
def _add_bias(self, name, bias, shape, input_name, output_name):
    """
    Add a bias layer to the model.

    Parameters
    ----------
    name: str
        The name of this layer.
    bias: numpy.array
        The biases to apply to the inputs. The size must be equal to
        the product of the ``shape`` dimensions.
    shape: tuple
        The shape of the bias. Must be one of the following:
        ``[1]``, ``[C]``, ``[1, H, W]`` or ``[C, H, W]``.
    input_name: str
        The input blob name of this layer.
    output_name: str
        The output blob name of this layer.
    """
    nn_spec = self.nn_spec

    # Add a new bias layer
    spec_layer = nn_spec.layers.add()
    spec_layer.name = name
    spec_layer.input.append(input_name)
    spec_layer.output.append(output_name)

    spec_layer_params = spec_layer.bias
    spec_layer_params.bias.floatValue.extend(map(float, bias.flatten()))
    spec_layer_params.shape.extend(shape)

_NeuralNetworkBuilder.add_bias = _add_bias
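For example (a hypothetical call; the layer and blob names are placeholders), to shift each of three output channels by 127.5:

import numpy as np

builder.add_bias(name='output_bias',
                 bias=np.array([127.5, 127.5, 127.5]),  # one value per channel
                 shape=[3],                             # the [C] variant
                 input_name='final_layer_output',       # your model's last blob
                 output_name='output')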
-
Re: Support for image outputs
adib Dec 27, 2018 1:29 AM (in response to FrankSchlegel)How do you incorporate the bias layer into the conversion process?
-
Re: Support for image outputs
ktak199 Oct 14, 2017 1:47 AM (in response to michael_s)I'm trying to convert the output of my mlmodel to a UIImage, but it's not working...
outputs of mlmodel : Image(Grayscale width x height)
guard let results = request.results as? [VNPixelBufferObservation] else {
    fatalError("Fatal error")
}
print(String(describing: type(of: results)))
print(String(describing: type(of: results[0])))
let ciImage = CIImage(cvPixelBuffer: results[0].pixelBuffer)
Outputs:
Array<VNPixelBufferObservation>
VNPixelBufferObservation
An error occurs on the last line (the CIImage init):
Thread 1: EXC_BAD_ACCESS (code=1, address=0xe136dbec8)
-------------------------------------------------------------------------------------------
Alternatively, I'm trying the approach that keeps the mlmodel output as a MultiArray.
outputs of mlmodel : MultiArray (Double 1 x width x height)
guard let results = request.results as? [VNCoreMLFeatureValueObservation] else {
    fatalError("Fatal error")
}
print(String(describing: type(of: results)))
print(String(describing: type(of: results[0])))
print(String(describing: type(of: results[0].featureValue)))
print(results[0].featureValue)
print(results[0].featureValue.multiArrayValue)
let imageMultiArray = results[0].featureValue.multiArrayValue
let imageData = UnsafeMutablePointer<Double>(OpaquePointer(imageMultiArray?.dataPointer))
let channelStride = imageMultiArray?.strides[0].intValue
let yStride = imageMultiArray?.strides[1].intValue
let xStride = imageMultiArray?.strides[2].intValue
func pixelOffset(_ channel: Int, _ y: Int, _ x: Int) -> Int {
    return channel*channelStride! + y*yStride! + x*xStride!
}
let topLeftGreenPixel = Unit8(imageData![pixelOffset(1,0,0)])
Outputs:
Array<VNCoreMLFeatureValueObservation>
VNCoreMLFeatureValueObservation
Optional<MLFeatureValue>
Optional(MultiArray : Double 1 x width x height array)
Optional(Double 1 x width x height array)
An error occurs on the last line: Use of unresolved identifier 'Unit8'
Should it be replaced by UInt8? And how do I convert the result to a UIImage?
Thank you in advance for any help!
-
Re: Support for image outputs
FrankSchlegel Oct 16, 2017 11:38 PM (in response to ktak199)Yes, Unit8 is a typo and should be UInt8.
But regardless, you should really try to adjust the spec of your model to produce an image output (instead of a multi-array). That way you can get the pixel buffer directly from the prediction. In the answer above you can see the Python function that can do that for you.
-
Re: Support for image outputs
ktak199 Oct 17, 2017 5:53 AM (in response to FrankSchlegel)Thank you for your response.
The output of my mlmodel is already an image (Grayscale width x height).
No error occurs when getting the results in the guard statement.
But an error occurs when I try to access pixelBuffer on the last line of the code below.
Error message: Thread 1: EXC_BAD_ACCESS (code=1, address=0xe136dbec8)
guard let results = request.results as? [VNPixelBufferObservation] else {
    fatalError("Fatal error")
}
print(String(describing: type(of: results)))    // -> Array<VNPixelBufferObservation>
print(String(describing: type(of: results[0]))) // -> VNPixelBufferObservation
let ciImage = CIImage(cvPixelBuffer: results[0].pixelBuffer)
Thank you.
-
Re: Support for image outputs
FrankSchlegel Oct 17, 2017 5:56 AM (in response to ktak199)Ok, that's indeed strange. Do you get any console output?
Also maybe try to use your model with CoreML directly, without Vision. Do you get the same error there?
-
Re: Support for image outputs
ktak199 Oct 17, 2017 6:51 AM (in response to FrankSchlegel)print(String(describing: type(of: results)) print(String(describing: type(of: results[0]))) let ciImage = CIImage(cvPixelBuffer: results[0].pixelBuffer)
Output is
Array<VNPixelBufferObservation>
VNPixelBufferObservation
Thread 1: EXC_BAD_ACCESS (code=1, address=0xe136dbec8)
How do I use Core ML directly, without Vision? Any URL?
-
Re: Support for image outputs
FrankSchlegel Oct 17, 2017 11:17 PM (in response to ktak199)It's a bit tedious because you have to do the pixel buffer conversion yourself. Check out the repo hollance/CoreMLHelpers on GitHub to see how it's done (sorry for no link, but I wanted to avoid waiting for moderation).
-
Re: Support for image outputs
ktak199 Oct 17, 2017 9:51 PM (in response to FrankSchlegel)It worked! Thank you.
-
Re: Support for image outputs
FrankSchlegel Oct 17, 2017 11:19 PM (in response to ktak199)Glad to hear that!
It's a bit curious that it's not working with Vision, though. Maybe it's because your output is grey-scale and Vision expects RGB?
-
Re: Support for image outputs
OliDem Nov 5, 2017 1:40 PM (in response to FrankSchlegel)Dear Frank and ktak199, I think I am facing the same issue.
I converted my torch-model for style transfer to coreml using the torch2coreml converter.
As soon as I access the pixelBuffer property of the VNPixelBufferObservation in the completionHandler of the VNCoreMLRequest, the program crashes with EXC_BAD_ACCESS.
Can you confirm that this problem is caused by using the vision framework, and not by the model conversion procedure?
So, if I use "plain" Core ML, chances are high that the model will work?
Thanks a lot in advance
Oliver
-
Re: Support for image outputs
FrankSchlegel Nov 8, 2018 1:02 AM (in response to OliDem)Hey Oliver,
Sorry for the late response.
While I can't confirm that this is definitely an issue with the Vision framework, I would at least recommend you give the manual approach a try. It's not that hard and it gives you much more control over the conversion. I don't know why Vision can't handle your model output, though.
-
Re: Support for image outputs
tyro tyro Jun 17, 2018 12:04 AM (in response to michael_s)Unlike others who have used this method (forcing the model to output an image) and gotten back CVPixelBuffers with alpha 0, I am getting back an alpha of 255 and r, g, b values in [0, 1]. Ignoring the alpha channel, PIL displays an image that is definitely related to the desired output. Could it have something to do with the image scale/RGB biases when I convert the model from Keras? I am not really sure where to go with this; we're trying to convert the CVPixelBuffer output into a Metal texture and want to avoid extra post-processing steps.
For clarity: I am converting from Keras to Core ML, and then calling "convert_multiarray_output_to_image(...)" on the Core ML model. I have an image scale of 1/127.5 and RGB biases of -1 at the Keras conversion step.
-
Re: Support for image outputs
adib Apr 25, 2019 10:23 PM (in response to tyro tyro)You probably need to alter the resulting Core ML model using the coremltools library (Python side). In short, add a bias layer at the end which performs the final linear transformation that you need.
I’ve written it out in detail here: https://cutecoder.org/programming/core-ml-image-output/
That took me a few weeks to figure out, hence I made a post that’s hopefully useful (and hopefully the functionality can be built into coremltools itself).
-
Re: Support for image outputs
_newcoder_ Feb 7, 2018 5:00 AM (in response to FrankSchlegel)array_shape = tuple(output.type.multiArrayType.shape)
This is returning an empty tuple. I checked the shape of my model in Xcode and the dimension is 1x1x2x224x224, which I guess corresponds to channels=2, height=224, width=224 (no idea about the other two dimensions). So my question is: why is an empty tuple being returned?
Also, I want to know what a channel value of 2 represents. The output of my model was supposed to be grayscale, so the value should have been 1, I guess.
Thanks in advance!
-
Re: Support for image outputs
FrankSchlegel Feb 7, 2018 5:21 AM (in response to _newcoder_)Hmm, it seems there is something off with your model spec. Can you print the output description (see the snippet below) and post it here? It should have a 3-dimensional shape and only one channel if it's a grayscale image.
The first two dimensions you see are used for internal batch processing and should actually not be exposed in the output of the model.
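If it helps, a quick way to print the output description from Python (the file name is a placeholder):

import coremltools

spec = coremltools.utils.load_spec('MyModel.mlmodel')
print(spec.description.output)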
-
Re: Support for image outputs
bruce25796 Nov 7, 2018 12:48 AM (in response to FrankSchlegel)Dear Developers:
I converted a Keras model that takes a gray single-channel image as input and outputs a gray single-channel image.
Using coremltools.converters.keras.convert with the following settings:
----------------------
coreml_model = coremltools.converters.keras.convert(model,
                                                    input_names='data',
                                                    image_input_names='data',
                                                    output_names='outputImage',
                                                    image_scale=1/255.0)
------------------------
coreml_model only sets Inputs data: Image(Grayscale 256x256)
Outputs outputImage: MultiArray(Double 1x256x256)
If I then run the code below,
--------------------------
def convert_multiarray_output_to_image(spec, feature_name, is_bgr=False):
    for output in spec.description.output:
        if output.name != feature_name:
            continue
        if output.type.WhichOneof('Type') != 'multiArrayType':
            raise ValueError("%s is not a multiarray type" % output.name)
        array_shape = tuple(output.type.multiArrayType.shape)
        channels, height, width = array_shape
        from coremltools.proto import FeatureTypes_pb2 as ft
        if channels == 1:
            output.type.imageType.colorSpace = ft.ImageFeatureType.ColorSpace.Value('GRAYSCALE')
        elif channels == 3:
            if is_bgr:
                output.type.imageType.colorSpace = ft.ImageFeatureType.ColorSpace.Value('BGR')
            else:
                output.type.imageType.colorSpace = ft.ImageFeatureType.ColorSpace.Value('RGB')
        else:
            raise ValueError("Channel Value %d not supported for image outputs" % channels)
        output.type.imageType.width = width
        output.type.imageType.height = height

spec = coreml_model.get_spec()
convert_multiarray_output_to_image(spec, 'outputImage', is_bgr=False)
newModel = coremltools.models.MLModel(spec)
------------------------
coreml_model only sets Inputs inputImage: MultiArray(Double 1x256x256)
Outputs outputImage: Image(Grayscale 256x256)
Question:
Is there any way to output coreml_model as below ?
Inputs inputImage: Image(Grayscale 256x256)
Outputs outputImage:Image( Grayscale 256x256)
Thanks
-
Re: Support for image outputs
FrankSchlegel Nov 8, 2018 11:56 PM (in response to bruce25796)Hi Bruce,
You also need to convert the input into an image. You can do that using the same method as for outputs: just replace all instances of "output" with "input" in your convert_multiarray_output_to_image method and you've got yourself a convert_multiarray_input_to_image method (sketched below). Then just apply that to the spec as well before creating your model.
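For reference, the input variant would look something like this (derived directly from the helper earlier in this thread; untested):

def convert_multiarray_input_to_image(spec, feature_name, is_bgr=False):
    from coremltools.proto import FeatureTypes_pb2 as ft
    for input_feature in spec.description.input:
        if input_feature.name != feature_name:
            continue
        if input_feature.type.WhichOneof('Type') != 'multiArrayType':
            raise ValueError("%s is not a multiarray type" % input_feature.name)
        channels, height, width = tuple(input_feature.type.multiArrayType.shape)
        if channels == 1:
            input_feature.type.imageType.colorSpace = ft.ImageFeatureType.ColorSpace.Value('GRAYSCALE')
        elif channels == 3:
            input_feature.type.imageType.colorSpace = ft.ImageFeatureType.ColorSpace.Value('BGR' if is_bgr else 'RGB')
        else:
            raise ValueError("Channel Value %d not supported for image inputs" % channels)
        input_feature.type.imageType.width = width
        input_feature.type.imageType.height = height

Apply it to the same spec before creating the model:

spec = coreml_model.get_spec()
convert_multiarray_input_to_image(spec, 'inputImage')
convert_multiarray_output_to_image(spec, 'outputImage')
newModel = coremltools.models.MLModel(spec)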
-