How to retrieve/save model after and during training

Hi, I have been following the WWDC21 "dynamic training on iOS" session. I have been able to get training working, with the iterations printed to the console as training progresses.

However, I am unable to retrieve the checkpoints or the resulting model once training has completed (or while it is in progress): nothing in the callbacks fires.

If I try to create a model from the sessionDirectory, it returns nil (even though training has clearly completed).

Please can someone help or provide pointers on how to access the results/checkpoints so that I can create an MLModel and use it.

var subscriptions = [AnyCancellable]()

let job = try! MLStyleTransfer.train(trainingData: datasource, parameters: trainingParameters, sessionParameters: sessionParameters)

job.result.sink { result in
    print("result ", result)
}
receiveValue: { model in
    try? model.write(to: sessionDirectory)
    let compiledURL = try? MLModel.compileModel(at: sessionDirectory)
    let mlModel = try? MLModel(contentsOf: compiledURL!)
}
.store(in: &subscriptions)

This also does not work:

job.checkpoints.sink { checkpoint in
    // Process checkpoint
    let model = MLStyleTransfer(trainingData: checkpoint)
}
.store(in: &subscriptions)

}

This is the printout in the console:

Using CPU to create model
+--------------+--------------+--------------+--------------+--------------+
| Iteration    | Total Loss   | Style Loss   | Content Loss | Elapsed Time |
+--------------+--------------+--------------+--------------+--------------+
| 1            | 64.9218      | 54.9499      | 9.97187      | 3.92s        |
2022-02-20 15:14:37.056251+0000 DynamicStyle[81737:9175431] [ServicesDaemonManager] interruptionHandler is called. -[FontServicesDaemonManager connection]_block_invoke
| 2            | 61.7283      | 24.6832      | 8.30343      | 9.87s        |
| 3            | 59.5098      | 27.7834      | 11.7603      | 16.19s       |
| 4            | 56.2737      | 16.163       | 10.985       | 22.35s       |
| 5            | 53.0747      | 12.2062      | 12.0783      | 28.08s       |
+--------------+--------------+--------------+--------------+--------------+

Any help would be appreciated on how to retrieve models.

Thanks

Answered by Frameworks Engineer in 705365022

Without knowing how you specified your sessionDirectory, it's hard to diagnose exactly what went wrong. That said, can you try writing the model to a temporary location to see whether that retrieves the model?

receiveValue: { model in
    let modelURL = URL(fileURLWithPath: NSTemporaryDirectory()).appendingPathComponent("MyModel.mlmodel")
    do {
        try model.write(to: modelURL)
        let compiledURL = try MLModel.compileModel(at: modelURL)

        // Instantiate an MLModel from the compiled model URL
        let mlmodel = try MLModel(contentsOf: compiledURL)

        // Make predictions using mlmodel with inputProvider (an MLFeatureProvider you supply)
        let result = try mlmodel.prediction(from: inputProvider)
    } catch {
        // error handling
    }
}

The code for instantiating the mlmodel and making predictions with it does not have to be in this closure, as long as the compiled URL can be retrieved.

I am not sure what error you get when you try to init from a checkpoint. Can you be more specific?

Hi,

Thank you for the reply.

I am following the "Control training in Create ML with Swift" session to set up training on the device. To create the sessionDirectory I used the following code:

let experimentID = "test4"

let sessionDirectory = URL(fileURLWithPath: "\(NSTemporaryDirectory())WWDC-\(experimentID)")

What appears in the console is that the model is trained (iterations are shown to be completed), but then nothing else happens. I wonder if it is a problem with how I am using Combine to retrieve the results and checkpoints - none of the test print statements fire inside the closures, which suggests the code block has not run.

job.result.sink { result in
    print("result", result)
}
receiveValue: { model in
    print("result", model)
}
.store(in: &subscriptions)

Your above code suggestion did not seem to run either.

I am new to coding and to Combine, so I am sure this is something simple. Any further help would be appreciated.

Thanks again
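
One common reason for Combine closures never firing, though not confirmed as the cause in this thread, is that the AnyCancellable returned by sink is released before the publisher emits, for example when the subscriptions collection is a local variable inside a function that returns while training is still running. A minimal sketch of the difference follows. `TrainingObserver`, `observe(_:)`, and the `String` payload are hypothetical stand-ins, and a `PassthroughSubject` plays the role of `job.result`; no Create ML call is made here.

```swift
import Combine

// Hypothetical owner object: it lives as long as training does,
// so the subscriptions it stores stay active.
final class TrainingObserver {
    private var subscriptions = Set<AnyCancellable>()
    private(set) var received: [String] = []

    func observe(_ result: AnyPublisher<String, Error>) {
        result.sink { completion in
            // Surface failures instead of silently ignoring them.
            if case .failure(let error) = completion {
                print("training failed:", error)
            }
        } receiveValue: { [weak self] model in
            self?.received.append(model)
        }
        .store(in: &subscriptions)
    }
}

// Stand-in for job.result (assumption: any publisher behaves the same way here).
let subject = PassthroughSubject<String, Error>()

// Retained subscription: the observer keeps its cancellables alive.
let observer = TrainingObserver()
observer.observe(subject.eraseToAnyPublisher())

// Discarded subscription: the AnyCancellable is released at once,
// which cancels the subscription before anything is emitted.
var dropped: [String] = []
_ = subject.sink(receiveCompletion: { _ in },
                 receiveValue: { dropped.append($0) })

subject.send("model")
subject.send(completion: .finished)

print("retained:", observer.received)
print("dropped:", dropped)
```

If the closures in your project never run, checking that `subscriptions` is a stored property of an object that outlives the training job (a view model or view controller, say) is a cheap first test.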

Just a quick follow-up: it is working now. Your code works; I am not sure why it didn't the first time. I deleted the app and started a new project, and it is all working now.

One quick question: should I clean/wipe the session directory after each model training to free up space? And what is the best practice for where to store the compiled models?

Thanks again for your help, I will hopefully release this feature soon in my app!
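
On the storage question, `MLModel.compileModel(at:)` returns a URL in a temporary location, so the compiled `.mlmodelc` is usually moved somewhere permanent such as the app's Application Support directory. Deleting the session directory afterwards is a judgment call: it frees space, but it also removes the checkpoints needed to resume that session. The sketch below uses only Foundation; `persistCompiledModel(from:into:removing:)` and its parameter names are hypothetical, not part of Create ML or Core ML.

```swift
import Foundation

/// Moves a compiled model into `modelsDirectory` and optionally deletes
/// the training session directory. Returns the model's permanent URL.
/// (Hypothetical helper; all names are illustrative.)
func persistCompiledModel(from compiledURL: URL,
                          into modelsDirectory: URL,
                          removing sessionDirectory: URL?) throws -> URL {
    let fileManager = FileManager.default

    // Make sure the destination folder exists.
    try fileManager.createDirectory(at: modelsDirectory,
                                    withIntermediateDirectories: true)

    let destination = modelsDirectory
        .appendingPathComponent(compiledURL.lastPathComponent)

    // Replace any model saved by an earlier training run.
    if fileManager.fileExists(atPath: destination.path) {
        try fileManager.removeItem(at: destination)
    }
    try fileManager.moveItem(at: compiledURL, to: destination)

    // Optional cleanup: only do this once the session no longer needs resuming.
    if let sessionDirectory = sessionDirectory,
       fileManager.fileExists(atPath: sessionDirectory.path) {
        try fileManager.removeItem(at: sessionDirectory)
    }
    return destination
}
```

In an app, `modelsDirectory` could be obtained with `FileManager.default.url(for: .applicationSupportDirectory, in: .userDomainMask, appropriateFor: nil, create: true)`, and the returned URL passed to `MLModel(contentsOf:)` on later launches.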
