Converting OCR Keras model correctly

Hi,


I have modified the sample OCR model from the Keras GitHub repository (https://github.com/fchollet/keras/blob/master/examples/image_ocr.py) to detect numbers.

I am able to successfully convert the model to the Core ML format. However, the converted model lists several extra inputs and outputs (one pair per GRU layer) that are not real inputs or outputs of my network.

The Keras model itself seems to be fine. However, the converted model no longer recognizes numbers. I have set image_scale to 1/255.0 because the network expects grayscale input with values from 0 to 1, as can be seen in the example starting at line 78.
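For reference, the conversion described above can be sketched roughly as follows. This is a minimal sketch, not my exact script: it assumes a coremltools release that still ships the Keras converter (coremltools.converters.keras.convert), and the helper names conversion_kwargs, convert_to_coreml and scaled are made up for illustration. The names "data" and "prediction" are taken from the spec printed further down.

```python
# Sketch only: assumes a coremltools release that still ships the Keras
# converter (coremltools.converters.keras) and a trained Keras model object.
# Helper names here are hypothetical, chosen for illustration.

def conversion_kwargs():
    """Keyword arguments for the Keras converter, matching the spec in this post."""
    return {
        "input_names": "data",
        "image_input_names": "data",  # treat 'data' as an image, not a multi-array
        "output_names": "prediction",
        "image_scale": 1 / 255.0,     # maps 8-bit pixels 0..255 to 0..1
    }


def convert_to_coreml(keras_model):
    """Run the conversion; coremltools is imported lazily so the rest works without it."""
    import coremltools
    return coremltools.converters.keras.convert(keras_model, **conversion_kwargs())


def scaled(pixel, image_scale=1 / 255.0, gray_bias=0.0):
    """Per-pixel preprocessing applied to a grayscale image input: scale * pixel + bias."""
    return pixel * image_scale + gray_bias


if __name__ == "__main__":
    # Black stays 0, white ends up at (approximately) 1.
    print(scaled(0), scaled(255))
```

The scaled helper is only there to make the effect of image_scale explicit: an 8-bit grayscale pixel is multiplied by 1/255.0, so the network should see the same 0 to 1 range it was trained on.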


I am not sure if I did anything wrong during conversion.

Could the GRU layers be causing the issue?


Here is the model:


____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                    
====================================================================================================
the_input (InputLayer)           (None, 35, 35, 1)     0                                           
____________________________________________________________________________________________________
conv1 (Conv2D)                   (None, 35, 35, 16)    160         the_input[0][0]                 
____________________________________________________________________________________________________
max1 (MaxPooling2D)              (None, 17, 17, 16)    0           conv1[0][0]                     
____________________________________________________________________________________________________
conv2 (Conv2D)                   (None, 17, 17, 16)    2320        max1[0][0]                      
____________________________________________________________________________________________________
max2 (MaxPooling2D)              (None, 8, 8, 16)      0           conv2[0][0]                     
____________________________________________________________________________________________________
reshape (Reshape)                (None, 8, 128)        0           max2[0][0]                      
____________________________________________________________________________________________________
dense1 (Dense)                   (None, 8, 32)         4128        reshape[0][0]                   
____________________________________________________________________________________________________
gru1 (GRU)                       (None, 8, 512)        837120      dense1[0][0]                    
____________________________________________________________________________________________________
gru1_b (GRU)                     (None, 8, 512)        837120      dense1[0][0]                    
____________________________________________________________________________________________________
add_1 (Add)                      (None, 8, 512)        0           gru1[0][0]                      
                                                                   gru1_b[0][0]                    
____________________________________________________________________________________________________
gru2 (GRU)                       (None, 8, 512)        1574400     add_1[0][0]                     
____________________________________________________________________________________________________
gru2_b (GRU)                     (None, 8, 512)        1574400     add_1[0][0]                     
____________________________________________________________________________________________________
reshape2 (Reshape)               (None, 512, 8)        0           gru2[0][0]                      
____________________________________________________________________________________________________
reshape2b (Reshape)              (None, 512, 8)        0           gru2_b[0][0]                    
____________________________________________________________________________________________________
concatenate_1 (Concatenate)      (None, 1024, 8)       0           reshape2[0][0]                  
                                                                   reshape2b[0][0]                 
____________________________________________________________________________________________________
reshape3 (Reshape)               (None, 8, 1024)       0           concatenate_1[0][0]             
____________________________________________________________________________________________________
dense2 (Dense)                   (None, 8, 12)         12300       reshape3[0][0]                  
____________________________________________________________________________________________________
softmax (Activation)             (None, 8, 12)         0           dense2[0][0]                    
====================================================================================================


And what coremltools sees:


0 : the_input, <keras.engine.topology.InputLayer object at 0x1147c6510>
1 : conv1, <keras.layers.convolutional.Conv2D object at 0x1147c67d0>
2 : conv1__activation__, <keras.layers.core.Activation object at 0x123a6a1d0>
3 : max1, <keras.layers.pooling.MaxPooling2D object at 0x114823a90>
4 : conv2, <keras.layers.convolutional.Conv2D object at 0x114823e50>
5 : conv2__activation__, <keras.layers.core.Activation object at 0x123a6a690>
6 : max2, <keras.layers.pooling.MaxPooling2D object at 0x114823ad0>
7 : reshape, <keras.layers.core.Reshape object at 0x1147c6a50>
8 : dense1, <keras.layers.core.Dense object at 0x11485de10>
9 : dense1__activation__, <keras.layers.core.Activation object at 0x123a6a790>
10 : gru1, <keras.layers.recurrent.GRU object at 0x11484db50>
11 : gru1_b, <keras.layers.recurrent.GRU object at 0x11630a610>
12 : add_1, <keras.layers.merge.Add object at 0x115f9e090>
13 : gru2, <keras.layers.recurrent.GRU object at 0x11f5ead50>
14 : gru2_b, <keras.layers.recurrent.GRU object at 0x116161210>
15 : reshape2, <keras.layers.core.Reshape object at 0x11f475cd0>
16 : reshape2b, <keras.layers.core.Reshape object at 0x12016e8d0>
17 : concatenate_1, <keras.layers.merge.Concatenate object at 0x11f41bd90>
18 : reshape3, <keras.layers.core.Reshape object at 0x120182e50>
19 : dense2, <keras.layers.core.Dense object at 0x11f83d150>
20 : softmax, <keras.layers.core.Activation object at 0x1202f4a10>
input {
  name: "data"
  type {
    imageType {
      width: 35
      height: 35
      colorSpace: GRAYSCALE
    }
  }
}
input {
  name: "gru1_h_in"
  type {
    multiArrayType {
      shape: 512
      dataType: DOUBLE
    }
    isOptional: true
  }
}
input {
  name: "gru1_b_h_in"
  type {
    multiArrayType {
      shape: 512
      dataType: DOUBLE
    }
    isOptional: true
  }
}
input {
  name: "gru2_h_in"
  type {
    multiArrayType {
      shape: 512
      dataType: DOUBLE
    }
    isOptional: true
  }
}
input {
  name: "gru2_b_h_in"
  type {
    multiArrayType {
      shape: 512
      dataType: DOUBLE
    }
    isOptional: true
  }
}
output {
  name: "prediction"
  type {
    multiArrayType {
      shape: 12
      dataType: DOUBLE
    }
  }
}
output {
  name: "gru1_h_out"
  type {
    multiArrayType {
      shape: 512
      dataType: DOUBLE
    }
  }
}
output {
  name: "gru1_b_h_out"
  type {
    multiArrayType {
      shape: 512
      dataType: DOUBLE
    }
  }
}
output {
  name: "gru2_h_out"
  type {
    multiArrayType {
      shape: 512
      dataType: DOUBLE
    }
  }
}
output {
  name: "gru2_b_h_out"
  type {
    multiArrayType {
      shape: 512
      dataType: DOUBLE
    }
  }
}


Note: The three reshape layers in the middle temporarily change the shape of the tensors so that the concatenation goes through the coremltools converter (concatenation has to go along a specific axis). Could this be causing the problem?
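To make the reshape question concrete, here is a small pure-Python sketch of what a reshape, concatenate, reshape sequence does to the data. It assumes row-major reshapes (as in NumPy and the TensorFlow backend) and uses tiny 2x3 tensors in place of the real (8, 512) GRU outputs; the helper names are hypothetical. The point is that a reshape only reinterprets the flat order of the values and is not a transpose, so the result differs from concatenating the two GRU outputs per time step. Any converter that orders the flat values differently would produce yet another layout.

```python
def flatten(rows):
    """Row-major flattening of a 2D list."""
    return [v for row in rows for v in row]


def reshape(rows, n_rows, n_cols):
    """Row-major reshape: keeps the flat order of values, only regroups them."""
    flat = flatten(rows)
    assert len(flat) == n_rows * n_cols
    return [flat[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]


def reshape_concat_reshape(a, b):
    """Mimics reshape2/reshape2b -> concatenate_1 -> reshape3 from the model summary."""
    steps, feats = len(a), len(a[0])
    a_r = reshape(a, feats, steps)      # (steps, feats) -> (feats, steps)
    b_r = reshape(b, feats, steps)
    stacked = a_r + b_r                 # concatenate along the first axis
    return reshape(stacked, steps, 2 * feats)


def concat_per_step(a, b):
    """Concatenating the features of both tensors for each time step."""
    return [row_a + row_b for row_a, row_b in zip(a, b)]


if __name__ == "__main__":
    forward = [[1, 2, 3], [4, 5, 6]]            # 2 time steps, 3 features
    backward = [[10, 20, 30], [40, 50, 60]]
    print(reshape_concat_reshape(forward, backward))
    print(concat_per_step(forward, backward))
```

With these inputs the first result groups all forward values into the first row and all backward values into the second, while the per-step concatenation keeps each time step's forward and backward features together.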


Last note:

Width and height are flipped during conversion, so I have made the input square (35x35). Do I need to rotate or transpose the image to produce a valid input?
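One way to test the orientation question is to feed the converted model a transposed copy of the image and compare the predictions. The sketch below assumes the Keras model was trained on arrays indexed as [x][y] (width first), which is an assumption on my side and not something the converter reports; the transpose helper is hypothetical and works on a plain list of pixel rows.

```python
def transpose(image):
    """Swap the two axes of a 2D list: rows become columns."""
    return [list(column) for column in zip(*image)]


if __name__ == "__main__":
    # A 2x3 image (2 rows, 3 columns) becomes 3x2 after transposing.
    image = [[1, 2, 3],
             [4, 5, 6]]
    print(transpose(image))
```

A square 35x35 input hides this kind of mismatch because both layouts have the same shape, so the model would accept the image either way and simply see it mirrored along the diagonal.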


Any help or suggestion would be appreciated. Thanks!