2 Replies
      Latest reply on May 2, 2019 3:30 AM by glenn.jocher
      3DTOPO Level 1 (0 points)

        I managed to create a Core ML 2 model with flexible input and output image sizes. I set both the input and output size ranges to 640...2048 x 640...2048.


        Using coremltools, I did the following:


        import coremltools
        from coremltools.models.neural_network import flexible_shape_utils

        # Load the spec of the fixed-shape model.
        spec = coremltools.utils.load_spec('mymodel_fxedShape.mlmodel')

        # Allow any height and width from 640 to 2048 pixels.
        img_size_ranges = flexible_shape_utils.NeuralNetworkImageSizeRange()
        img_size_ranges.add_height_range(640, 2048)
        img_size_ranges.add_width_range(640, 2048)

        # Apply the same range to the input and the output image features.
        flexible_shape_utils.update_image_size_range(spec, feature_name='inputImage', size_range=img_size_ranges)
        flexible_shape_utils.update_image_size_range(spec, feature_name='outputImage', size_range=img_size_ranges)

        # Save the flexible-shape model.
        coremltools.utils.save_spec(spec, 'myModel.mlmodel')
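
One way to confirm that the saved model really carries the flexible range on both features is to read the size information back out of the spec. The following is a minimal sketch, assuming the field names of the Core ML model protobuf (`imageSizeRange`, `widthRange`, `lowerBound`, and so on); `image_flexibility` and `report` are hypothetical helper names, not part of coremltools.

```python
def image_flexibility(feature):
    """Summarize the size flexibility declared for one image feature."""
    image_type = feature.type.imageType
    # The fixed width/height stored in the spec alongside any flexibility.
    summary = {'default': (image_type.width, image_type.height)}
    kind = image_type.WhichOneof('SizeFlexibility')
    if kind == 'imageSizeRange':
        size_range = image_type.imageSizeRange
        summary['width'] = (size_range.widthRange.lowerBound,
                            size_range.widthRange.upperBound)
        summary['height'] = (size_range.heightRange.lowerBound,
                             size_range.heightRange.upperBound)
    elif kind == 'enumeratedSizes':
        summary['sizes'] = [(s.width, s.height)
                            for s in image_type.enumeratedSizes.sizes]
    return summary


def report(spec_path):
    """Print the flexibility of every image input and output in a saved model."""
    # Imported here so that image_flexibility itself has no dependency.
    import coremltools
    spec = coremltools.utils.load_spec(spec_path)
    features = list(spec.description.input) + list(spec.description.output)
    for feature in features:
        if feature.type.WhichOneof('Type') == 'imageType':
            print(feature.name, image_flexibility(feature))


# Example (requires coremltools): report('myModel.mlmodel')
```

The `default` entry is the fixed width and height that remain in the spec next to the range. Comparing it with the size the output buffer actually comes back at may help narrow down whether the model or the calling code is responsible.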


        In Xcode, the range of shapes appears under the "Flexibility" column. Screenshot: https://3DTOPO.com/modelScreenshot.jpg



        However, I can't figure out how to set the size in my project. If I set the input pixel buffer size to 2048x2048, the output pixel buffer is still 1536x1536. If I set it to 768x768, the resulting pixel buffer is still 1536x1536, but it is blank outside the 768x768 region.


        I examined the automatically generated Swift model class and I don't see any clues there.


        I can't find a single example anywhere showing how to use the "Flexibility" sizes. WWDC 2018 Session 708, "What's New in Core ML, Part 1", states:

        This means that now you have to ship a single model. You don't have to have any redundant code. And if you need to switch between standard definition and high definition, you can do it much faster because we don't need to reload the model from scratch; we just need to resize it. You have two options to specify the flexibility of the model. You can define a range for its dimension, so you can define a minimal width and height and the maximum width and height. And then at inference pick any value in between. But there is also another way. You can enumerate all the shapes that you are going to use. For example, all different aspect ratios, all different resolutions, and this is better for performance. Core ML knows more about your use case earlier, so it can -- it has the opportunities of performing more optimizations.
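
The enumerated option mentioned in the session can be set with the same `flexible_shape_utils` module. This is a minimal sketch, assuming the feature names `inputImage` and `outputImage` from the code above and an illustrative list of square sizes that does not come from the original model; `add_enumerated_sizes`, `sizes_within`, and the output filename are hypothetical. It starts from the fixed-shape spec because an image feature carries either a range or an enumerated list, not both.

```python
# Illustrative (height, width) pairs; these values are assumptions,
# not taken from the original model.
CANDIDATE_SIZES = [(640, 640), (1024, 1024), (1536, 1536), (2048, 2048)]


def sizes_within(sizes, lower, upper):
    """Return True if every (height, width) pair lies inside [lower, upper]."""
    return all(lower <= h <= upper and lower <= w <= upper for h, w in sizes)


def add_enumerated_sizes(spec, feature_names, sizes):
    """Declare an explicit list of allowed image sizes on each named feature."""
    # Imported here so that sizes_within can be used without coremltools.
    from coremltools.models.neural_network import flexible_shape_utils
    image_sizes = [
        flexible_shape_utils.NeuralNetworkImageSize(height=h, width=w)
        for h, w in sizes
    ]
    for name in feature_names:
        flexible_shape_utils.add_enumerated_image_sizes(
            spec, feature_name=name, sizes=image_sizes)


# Example (requires coremltools; the output filename is hypothetical):
# import coremltools
# spec = coremltools.utils.load_spec('mymodel_fxedShape.mlmodel')
# add_enumerated_sizes(spec, ['inputImage', 'outputImage'], CANDIDATE_SIZES)
# coremltools.utils.save_spec(spec, 'myModel_enumerated.mlmodel')
```

This only changes how the model declares its allowed shapes. It does not by itself show how to select one of those sizes at inference time in Swift.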

        They say "we just need to resize it". It is extremely frustrating because they don't explain how to resize it. They also say "And then at inference pick any value in between" but offer no clue as to how to pick that value. It wouldn't be so frustrating if I could find this documented somewhere, but I have been up and down the Apple docs, the sample code, and the Core ML classes, and I just can't figure it out.