I am currently working on a 2D pose estimator. I developed a PyTorch vision-transformer-based model that predicts 17 joints in COCO format, then converted it to CoreML using coremltools 6.2.
The model was trained on a custom dataset. However, when running the converted model on iOS, I observed a significant drop in accuracy. This video (https://youtu.be/EfGFrOZQGtU) shows the outputs of the PyTorch model (left) and the CoreML model (right).
Could you please confirm if this drop in accuracy is expected and suggest any possible solutions to address this issue? Please note that all preprocessing and post-processing techniques remain consistent between the models.
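One way to narrow down where the drop comes from is to feed the same preprocessed input to both models and compare the decoded keypoints numerically. The sketch below is not from the original post: it assumes each model's output has already been decoded to 17 (x, y) pairs in pixel coordinates, and `per_joint_error` and `summarize` are hypothetical helper names. The data is a placeholder.

```python
import math

NUM_JOINTS = 17  # COCO keypoint count, as stated in the post


def per_joint_error(reference, candidate):
    """Euclidean distance in pixels between matching joints.

    Both arguments are sequences of (x, y) pairs, one per joint.
    """
    if len(reference) != len(candidate):
        raise ValueError("keypoint lists must have the same length")
    return [math.hypot(rx - cx, ry - cy)
            for (rx, ry), (cx, cy) in zip(reference, candidate)]


def summarize(errors):
    """Return (mean error, max error, index of the worst joint)."""
    worst = max(range(len(errors)), key=errors.__getitem__)
    return sum(errors) / len(errors), errors[worst], worst


# Placeholder data: identical poses except joint 9, shifted by (3, 4) pixels.
pytorch_kpts = [(10.0 * i, 5.0 * i) for i in range(NUM_JOINTS)]
coreml_kpts = list(pytorch_kpts)
coreml_kpts[9] = (pytorch_kpts[9][0] + 3.0, pytorch_kpts[9][1] + 4.0)

errors = per_joint_error(pytorch_kpts, coreml_kpts)
mean_err, max_err, worst_joint = summarize(errors)
print(f"mean={mean_err:.3f}px max={max_err:.3f}px worst joint={worst_joint}")
```

If the error is small on raw model outputs but large on screen, the difference is more likely in preprocessing or post-processing than in the conversion itself.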
P.S. While converting, I also got the following warning:
TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if x.numel() == 0 and obsolete_torch_version(TORCH_VERSION, (1, 4)):
P.P.S. When we initialize the CoreML model on iOS 17.0, we get this error:
Validation failure: Invalid Pool kernel width (13), must be [1-8] or 20.
Validation failure: Invalid Pool kernel width (9), must be [1-8] or 20.
Validation failure: Invalid Pool kernel width (13), must be [1-8] or 20.
Validation failure: Invalid Pool kernel width (9), must be [1-8] or 20.
Validation failure: Invalid Pool kernel width (13), must be [1-8] or 20.
This neural network model does not have a parameter for requested key 'precisionRecallCurves'. Note: only updatable neural network models can provide parameter values and these values are only accessible in the context of an MLUpdateTask completion or progress handler.
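The validation messages above reject pooling kernels of width 9 and 13. If those layers happen to be stride-1 max pools with "same" padding (an assumption; the post does not say what kind of pooling they are), a wide kernel can be rewritten as a chain of narrower ones before conversion: two width-5 max pools cover the same window as one width-9 pool, and three cover width 13. The sketch below checks that identity in one dimension with plain Python. `max_pool_1d` is a hypothetical helper, not a PyTorch or Core ML call, and the identity does not carry over to average pooling.

```python
def max_pool_1d(values, kernel):
    """Stride-1 max pool with 'same' padding (edges see a truncated window)."""
    assert kernel % 2 == 1, "sketch assumes an odd kernel width"
    radius = kernel // 2
    return [max(values[max(0, i - radius): i + radius + 1])
            for i in range(len(values))]


signal = [3, -1, 4, 1, -5, 9, 2, -6, 5, 3, 5, -8, 9, 7, -9, 3, 2, 3, 8, 4]

# One wide pool versus a chain of width-5 pools.
wide_9 = max_pool_1d(signal, 9)
stacked_9 = max_pool_1d(max_pool_1d(signal, 5), 5)

wide_13 = max_pool_1d(signal, 13)
stacked_13 = max_pool_1d(max_pool_1d(max_pool_1d(signal, 5), 5), 5)

print(wide_9 == stacked_9, wide_13 == stacked_13)
```

Whether the stacked form also avoids the validation failure on device would still need to be checked against the converted model.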
Hi all, I can't use random.PRNGKey to generate a random seed. Has anyone run into a similar issue and figured it out? Here is my current config: jax-metal==0.0.3, jaxlib==0.4.10, jax==0.4.11.
I am using an Apple M1 Pro.
Does the new Image Playground API allow programmatically generating images? Can the app generate and use them without the API's UI or would that require using another generative image model?
I've been going through the documentation. I can't seem to find the docs that cover all the new AI features.
I have a Shortcuts action, implemented as an App Intent, that I want only active subscribers to be able to use.
I have a shared class that handles everything subscription-related. But for some reason my code only works while the app is running in the background. Once the app has been quit and the user runs the Shortcut, the not-subscribed error is thrown, even though the user is subscribed.
How can I ensure that my subscription check is done correctly if the app isn’t open in the background?
My Code
App Intent excerpt:
@MainActor
func perform() async throws -> some IntentResult & ReturnsValue<MeterIntentEntity> {
    // Validate that the user is subscribed.
    // Cancels action with error message if not subscribed.
    if SubscriptionManager.shared.userIsSubscribed == false {
        throw IntentError.notSubscribed
    }
    // More Code …
    // Finish and pass created value as result.
    return .result(value: something)
}
Subscription Manager excerpt:
class SubscriptionManager: ObservableObject {
    // A singleton for our entire app to use
    static let shared = SubscriptionManager()

    let productIds = ["my_sub1", "my_sub2"]
    @Published private(set) var availableSubscriptions: [Product]
    @Published private(set) var purchasedSubscriptions: [Product] = []

    public var userIsSubscribed: Bool {
        return !self.purchasedSubscriptions.isEmpty
    }

    init() {
        // Initialize empty products, and then do a product request asynchronously to fill them in.
        availableSubscriptions = []
        Task {
            await updatePurchasedProducts()
        }
    }

    @MainActor
    func updatePurchasedProducts() async {
        for await result in Transaction.currentEntitlements {
            do {
                let transaction = try checkVerified(result)
                if let subscription = availableSubscriptions.first(where: { $0.id == transaction.productID }) {
                    purchasedSubscriptions.append(subscription)
                }
            } catch {
                Logger.subscription.error("Error loading user's purchased products.")
            }
        }
    }
}
From https://www.apple.com/newsroom/2024/06/introducing-apple-intelligence-for-iphone-ipad-and-mac/:
Powered by Apple Intelligence, Siri becomes more deeply integrated into the system experience. With richer language-understanding capabilities, Siri is more natural, more contextually relevant, and more personal, with the ability to simplify and accelerate everyday tasks.
From https://developer.apple.com/apple-intelligence/:
Siri is more natural, more personal, and more deeply integrated into the system. Apple Intelligence provides Siri with enhanced action capabilities, and developers can take advantage of pre-defined and pre-trained App Intents across a range of domains to not only give Siri the ability to take actions in your app, but to make your app’s actions more discoverable in places like Spotlight, the Shortcuts app, Control Center, and more. SiriKit adopters will benefit from Siri’s enhanced conversational capabilities with no additional work. And with App Entities, Siri can understand content from your app and provide users with information from your app from anywhere in the system.
Based on this, as well as the video at https://developer.apple.com/videos/play/wwdc2024/10133/ , my understanding is that in order for Siri to be able to execute tasks in applications, those applications must implement the Siri Intents API.
Can someone at Apple please clarify: will it be possible for Siri or some other aspect of Apple Intelligence / Core ML / Create ML to take actions in applications which do not support these APIs (e.g. web apps, Citrix apps, legacy apps)?
Thank you!
I am developing an iOS app that supports INPlayMediaIntent.
We are trying to increase the recognition rate of content names, which are song titles, using AppIntentVocabulary.
As a sample, some extracts are shown below.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>IntentPhrases</key>
<array>
<dict>
<key>IntentName</key>
<string>INPlayMediaIntent</string>
<key>IntentExamples</key>
<array>
<string>Mezamashi Appで湖畔の朝を再生</string>
<string>湖畔の朝をMezamashi Appで再生して</string>
</array>
</dict>
</array>
<key>ParameterVocabularies</key>
<array>
<dict>
<key>ParameterNames</key>
<array>
<string>INPlayMediaIntent.playlistTitle</string>
</array>
<key>ParameterVocabulary</key>
<array>
<dict>
<key>VocabularyItemIdentifier</key>
<string>ID1</string>
<key>VocabularyItemSynonyms</key>
<array>
<dict>
<key>VocabularyItemPronunciation</key>
<string>aogamagaeru</string>
<key>VocabularyItemPhrase</key>
<string>青ガマガエル</string>
</dict>
</array>
</dict>
<dict>
<key>VocabularyItemIdentifier</key>
<string>ID2</string>
<key>VocabularyItemSynonyms</key>
<array>
<dict>
<key>VocabularyItemPronunciation</key>
<string>kohon no asa</string>
<key>VocabularyItemPhrase</key>
<string>湖畔の朝</string>
</dict>
</array>
</dict>
<dict>
<key>VocabularyItemIdentifier</key>
<string>ID3</string>
<key>VocabularyItemSynonyms</key>
<array>
<dict>
<key>VocabularyItemPronunciation</key>
<string>kumageratachi no uta</string>
<key>VocabularyItemPhrase</key>
<string>クマゲラたちの歌</string>
</dict>
</array>
</dict>
</array>
</dict>
</array>
</dict>
</plist>
When running on the iOS 17.5 simulator in Xcode 15.4, the results are as follows.
mediaName = VocabularyItemIdentifier
mediaIdentifier = nil
<INMediaSearch: 0x6000026212c0> {
reference = 0;
mediaType = 0;
sortOrder = 0;
albumName = <null>;
mediaName = ID1;
genreNames = (
);
artistName = <null>;
moodNames = (
);
releaseDate = <null>;
mediaIdentifier = <null>;
}
However, when running on an iOS 17.5 device, the following applies.
mediaName = VocabularyItemPhrase
mediaIdentifier = VocabularyItemIdentifier
<INMediaSearch: 0x301efd9e0> {
reference = 0;
mediaType = 5;
sortOrder = 0;
albumName = <null>;
mediaName = 青ガマガエル;
genreNames = (
);
artistName = <null>;
moodNames = (
);
releaseDate = <null>;
mediaIdentifier = ID1;
}
The results are not stable. For example, sometimes all the other fields come back as null.
I have tried everything I can think of, and it has taken a long time without success.
Does anyone have any advice on this?
I am trying to make a VoIP CarPlay app that uses Siri.
let assistant = CPAssistantCellConfiguration(position: .top, visibility: .always, assistantAction: .startCall)
let siriTemplate = CPListTemplate(title: "Siri", sections: [sectionItems, loadingSection], assistantCellConfiguration: assistant)
siriTemplate.tabSystemItem = .recents
siriTemplate.showsTabBadge = false
Using the above code gives me this error on app launch:
"Error: Intent of type INStartCallIntent is not supported for this app category"
I have INStartCallIntent in my app's Info.plist, I have all the entitlements, and the app category is "business".
I can find no help online with this. What does this error actually mean, and how can I fix it?
https://developer.apple.com/videos/play/wwdc2024/10159/
This video references demo_utils, but I did not see any source code attached to the video. Does anyone have access to it?
iOS 18 adds a specific macro for exposing your search app intent, app entities, etc. to Siri, but how are you meant to add it to your existing objects without removing them entirely for users on versions earlier than iOS 18?
For example, I get the following error:
AssistantIntent(schema:) is only available in iOS 18 or newer. Add @available attribute to enclosing struct.
I don't want to do that since I still want to support iOS 17 users with my existing shortcuts. Do I need to duplicate my entire shortcuts model to add the new macro?
Are there going to be any sessions on Image Playgrounds API for iOS?
"Explore machine learning on Apple platforms" mentions the writing and points to sessions, but only mentions Image Playground without pointing to sessions.
The What's New in Create ML session at WWDC24 went into great depth on time-series forecasting models (beginning at 15:14) and mentioned these new models, capabilities, and tools for iOS 18. So far, all I can find is API documentation. I don't see any other WWDC24 session covering these new time-series forecasting features in Create ML.
Is there more substantial documentation on how to use these with Create ML? Maybe I am looking in the wrong place, but I am fairly new to ML.
Is there any food truck / donut shop demo or sample code like the one in the video?
Getting ahead of the curve on this is of great interest for business applications that could apply it to inventory and ordering data.
After watching the What's new in App Intents session, I'm attempting to create an intent conforming to URLRepresentableIntent. The video states that as long as my AppEntity conforms to URLRepresentableEntity, I should not have to provide a perform method. My application will be launched automatically and passed the appropriate URL.
This seems to work in that my application is launched and is passed a URL, but the URL is in the form: FeatureEntity/{id}.
Am I missing something, or is there a trick that enables it to pass along the URL specified in the AppEntity itself?
struct MyExampleIntent: OpenIntent, URLRepresentableIntent {
    static let title: LocalizedStringResource = "Open Feature"

    static var parameterSummary: some ParameterSummary {
        Summary("Open \(\.$target)")
    }

    @Parameter(title: "My feature", description: "The feature to open.")
    var target: FeatureEntity
}

struct FeatureEntity: AppEntity {
    // ...
}

extension FeatureEntity: URLRepresentableEntity {
    static var urlRepresentation: URLRepresentation {
        "https://myurl.com/\(.id)"
    }
}
The Translation API introduced at Session 10117 is impressive, but limiting it to SwiftUI is restrictive.
This API works great in the demo, but for more complex apps, it lacks flexibility because it is bound to SwiftUI Views.
Please consider making it available in non-SwiftUI environments.
I created a model that classifies certain objects using YOLOv8. I noticed that the model is not working properly in my application. While the model works fine in the Xcode preview, in the application it either returns the same result with 99% confidence for each classification or does not provide any result.
In Preview it looks like this:
Predictions:
extension CameraVC: AVCapturePhotoCaptureDelegate {
    func photoOutput(_ output: AVCapturePhotoOutput, didFinishProcessingPhoto photo: AVCapturePhoto, error: (any Error)?) {
        guard let data = photo.fileDataRepresentation() else {
            return
        }
        guard let image = UIImage(data: data) else {
            return
        }
        guard let cgImage = image.cgImage else {
            fatalError("Unable to create CGImage")
        }
        let handler = VNImageRequestHandler(cgImage: cgImage, orientation: CGImagePropertyOrientation(image.imageOrientation))
        // Run the Vision request off the main thread.
        DispatchQueue.global(qos: .userInitiated).async {
            do {
                try handler.perform([self.viewModel.detectionRequest])
            } catch {
                fatalError("Failed to perform detection: \(error)")
            }
        }
    }
}

// detectionRequest, as referenced through self.viewModel above:
lazy var detectionRequest: VNCoreMLRequest = {
    do {
        let model = try VNCoreMLModel(for: bestv720().model)
        let request = VNCoreMLRequest(model: model) { [weak self] request, error in
            self?.processDetections(for: request, error: error)
        }
        request.imageCropAndScaleOption = .centerCrop
        return request
    } catch {
        fatalError("Failed to load Vision ML model: \(error)")
    }
}()
This is where I print the recognized objects:
func processDetections(for request: VNRequest, error: Error?) {
    DispatchQueue.main.async {
        guard let results = request.results as? [VNRecognizedObjectObservation] else {
            return
        }
        var label = ""
        // Empty array literals need an explicit element type in Swift.
        var all_results: [String] = []
        var all_confidence: [Float] = []
        var true_results: [String] = []
        var true_confidence: [Float] = []
        for result in results {
            // Walk this observation's own labels; indexing labels with
            // 0...results.count can run past the end of the array.
            for candidate in result.labels {
                all_results.append(candidate.identifier)
                all_confidence.append(candidate.confidence)
                // Keep only labels above the 0.7 confidence threshold.
                if candidate.confidence > 0.7 {
                    true_results.append(candidate.identifier)
                    true_confidence.append(candidate.confidence)
                }
            }
            // labels is sorted by confidence, so the first entry is the top label.
            label = result.labels.first?.identifier ?? label
        }
        print("True Results ", true_results)
        print("True Confidence ", true_confidence)
        self.output?.updateView(label: label)
    }
}
I converted the model like this:
from ultralytics import YOLO
model = YOLO(model_path)
model.export(format='coreml', nms=True, imgsz=[720,1280])
Hey, on my iPhone 15 Pro I can't use the new Siri; it shows me the old Siri design. I am running the iOS 18.0 developer beta. I have no idea why it doesn't work for me.
We are currently working on implementing a baby cry detection model in the frontend of our app but have encountered some challenges with the mel spectrogram transformation.
Our mel spectrogram class, developed in python, leverages librosa for generating mel spectrograms (librosa.feature.melspectrogram and librosa.power_to_db). While we have successfully exported the model to a .mlmodel file, the results we obtain in Swift differ significantly from those generated by our Python code.
Could this discrepancy be due to the use of librosa in Python, which might not be directly compatible with Swift? Or should the transformation process be inherently consistent once exported to a .mlmodel file?
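For context on the question above: librosa's functions run in NumPy on the Python side, so unless the spectrogram step was rebuilt as model layers it would not normally be part of the exported .mlmodel, and the Swift code would have to reproduce it. The sketch below restates, in plain Python, the arithmetic that `librosa.power_to_db` performs with its documented defaults (`ref=1.0`, `amin=1e-10`, `top_db=80.0`), so that a Swift port can be compared against it step by step. `power_to_db_sketch` is a hypothetical name and the input values are made up. The STFT and mel filterbank settings used in Python (`n_fft`, `hop_length`, window, `center` padding, number of mel bands) would need to match on the Swift side as well.

```python
import math


def power_to_db_sketch(power, ref=1.0, amin=1e-10, top_db=80.0):
    """Convert a 2D list of power values to decibels.

    Follows the steps of librosa.power_to_db for a scalar `ref`:
    floor at `amin`, take 10*log10, subtract the reference level,
    then clip everything more than `top_db` below the peak.
    """
    offset = 10.0 * math.log10(max(amin, ref))
    db = [[10.0 * math.log10(max(amin, p)) - offset for p in row]
          for row in power]
    if top_db is not None:
        floor = max(max(row) for row in db) - top_db
        db = [[max(v, floor) for v in row] for row in db]
    return db


# Tiny made-up "spectrogram": 2 mel bands x 3 frames of power values.
spec = [[1.0, 0.1, 0.0],
        [100.0, 1e-12, 10.0]]
db = power_to_db_sketch(spec)
for row in db:
    print([round(v, 3) for v in row])
```

Dumping the intermediate arrays (STFT magnitudes, mel energies, dB values) from both implementations for the same audio clip is one way to find the first step at which they diverge.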
My app has several resources that I'd like to open through App Intents, for example a series of dictionaries. In the app, however, these resources sit behind a login (for security) and are purchased entitlements; a user may own, say, 4 of the 7 dictionaries.
If I want to have an intent that says, "Open Dictionary: (Dict Name)" how do I best handle situations where the user may no longer be logged in or have the entitlement for that specific dictionary?
Thanks
I'm trying to convert a TensorFlow model that I didn't create and know approximately nothing about to CoreML so that I can use it in some functional tests. I can't tell you much about the model, but you can read about it on the blog from the team that created it: https://research.google/blog/improving-mobile-app-accessibility-with-icon-detection/
I can't convert this model to a TensorFlow Lite model because it uses a few full TensorFlow operations (which I could work around) and it exceeds the 4-tensor output limit (which I can't, AFAIK). So instead, I'm trying to convert the model to CoreML so that I can run it on-device.
The issue I'm running into is that every approach fails in different ways. If I load the model with tf.saved_model.load and pass that as the first parameter to the convert call, it says
NotImplementedError: Expected model format: [SavedModel | concrete_function | tf.keras.Model | .h5 | GraphDef], got <tensorflow.python.trackable.autotrackable.AutoTrackable object at 0x30d90c250>
If I pass model.signatures['serving_default'] as the first parameter to convert, I get
NotImplementedError: Expected model format: [SavedModel | concrete_function | tf.keras.Model | .h5 | GraphDef], got ConcreteFunction [...a page or two of info about the function here...]
If I try to wrap it in a Keras layer using the instructions provided in the converter, it fails because a sequential model can't have multiple outputs.
If I try to use a tf.keras.layers.TFSMLayer to load the model, it fails because there are multiple tags, and there's no way to specify tags when constructing the layer. (It tells me that I need to add 'tags' to load the model, but if I do that, it tells me that tags isn't a valid parameter to the call.)
If I load the model with tf.saved_model.load and specify a single tag, then re-save it in a different location with tf.saved_model.save to generate a new model with only a single tag, then do
input_layer = tf.keras.Input(shape=(768, 768, 3), dtype="int8")
layer = tf.keras.layers.TFSMLayer("./serve_model", call_endpoint='serving_default')
outputs = layer(input_layer)
model = tf.keras.Model(input_layer, outputs)
I get
AttributeError: 'Functional' object has no attribute '_get_save_spec'
At one point, I also tried this:
class LayerFromSavedModel(tf.keras.layers.Layer):
    def __init__(self):
        super(LayerFromSavedModel, self).__init__()
        self.vars = legacy_model.variables

    def call(self, inputs):
        return legacy_model.signatures['serving_default'](inputs)

input = tf.keras.Input(shape=(3000, 3000, 3))
model = tf.keras.Model(input, LayerFromSavedModel()(input))
and saw a similar failure.
I've run out of ideas here. Is there simply no support whatsoever in the converter for importing a TensorFlow 2 SavedModel into CoreML, or am I missing something fundamental?
Hello. Where can I find some examples of creating custom Genmoji in Swift and reusing them in an app?