Cannot squeeze a dimension whose value is not 1

I have an mlmodel based on pytorch-pretrained-BERT, exported via ONNX to CoreML. That process went pretty smoothly, so now I'm trying to do some very basic testing: just making some kind of prediction, and getting a rough idea of what performance problems we might encounter.


However, when I try to run prediction, I get the following error:


[espresso] [Espresso::handle_ex_plan] exception=Espresso exception: "Invalid state": Cannot squeeze a dimension whose value is not 1: shape[1]=128 stat
2020-02-16 11:36:05.959261-0800 Spliqs[6725:2140794] [coreml] Error computing NN outputs -5


Is this error indicating a problem with the model itself (i.e., from model conversion), or is there something in Swift/CoreML-land that I'm doing wrong? My prediction function looks like this:


public func prediction(withInput input: String) -> MLMultiArray? {
    var predictions: MLMultiArray? = nil
    if let drummer = drummerBertMLModel {
        // Tokenize, then pad the ids out to the model's sequence length (128).
        var ids = tokenizer.tokenizeToIds(text: input)
        while ids.count < 128 {
            ids.append(1)
        }
        let segMask = [Int](repeating: 0, count: ids.count)
        let inputMLArray = MLMultiArray.from(ids, dims: 2)
        let segMaskMLArray = MLMultiArray.from(segMask, dims: 2)
        let modelInput = spliqs_bert_fp16Input(input_1: inputMLArray, input_3: segMaskMLArray)
        var modelOutput: spliqs_bert_fp16Output? = nil
        do {
            modelOutput = try drummer.prediction(input: modelInput)
        } catch {
            print("Error running prediction on drummer: \(error)")
        }
        if let modelOutput = modelOutput {
            predictions = modelOutput._1139
        }
    }
    return predictions
}
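For reference, the padding step in that function could be factored into a small helper that also truncates over-length inputs (the padding loop above would silently pass through anything longer than 128 tokens). This is just an illustrative sketch: `padOrTruncate` is a hypothetical name, and the length 128 and pad id 1 are taken from my code above; the correct pad id really depends on the tokenizer's vocabulary.

```swift
// Hypothetical helper: pad token ids to a fixed sequence length with a
// given pad id, or truncate if the input is already too long.
// maxLength = 128 and padId = 1 mirror the values in the function above.
func padOrTruncate(_ ids: [Int], maxLength: Int = 128, padId: Int = 1) -> [Int] {
    if ids.count >= maxLength {
        // Input is at or over the limit: keep only the first maxLength ids.
        return Array(ids.prefix(maxLength))
    }
    // Input is short: append pad ids until we reach maxLength.
    return ids + Array(repeating: padId, count: maxLength - ids.count)
}
```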


I'm not trying to do anything sophisticated with this at this stage; I just want to get it running.


I used pytorch-pretrained-BERT because I was able to find a ground-up pretraining example. But I have since noticed that Hugging Face released a "from scratch" training option just a couple of days ago, so I am happy to move over to that if the general consensus is that my current approach is likely to be a dead end.


Any thoughts appreciated.


J.