I'm trying to generate a JSON file for my training data. I tried writing it by hand first and then tried using Roboflow, and I still get the same error:
_annotations.createml.json file contains field "Index 0" that is not of type String.
The JSON format provided by Roboflow was:
[{"image":"menu1_jpg.rf.44dfacc93487d5049ed82952b44c81f7.jpg","annotations":[{"label":"100","coordinates":{"x":497,"y":431.5,"width":32,"height":10}}]}]
Any help would be greatly appreciated.
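For reference, here is a minimal Codable sketch of the layout above (the struct names are mine) that can be used to sanity-check the file locally; a field of the wrong type then surfaces as a specific DecodingError rather than the vaguer Create ML message:
import Foundation

// Hypothetical Codable mirror of the createml.json layout shown above.
struct ImageAnnotation: Codable {
    struct Box: Codable { let x, y, width, height: Double }
    struct LabeledBox: Codable {
        let label: String        // must be a string, e.g. "100", not the number 100
        let coordinates: Box
    }
    let image: String
    let annotations: [LabeledBox]
}

let url = URL(fileURLWithPath: "_annotations.createml.json")  // adjust the path as needed
do {
    let data = try Data(contentsOf: url)
    let entries = try JSONDecoder().decode([ImageAnnotation].self, from: data)
    print("Decoded \(entries.count) annotated images")
} catch {
    print("Annotation file failed to decode: \(error)")
}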
Hi everyone, I might need some help with on-device recognition. It seems that, during a single audio session, the speech recognition task will discard whatever it has transcribed once a new sentence starts (or once it believes a new sentence has started), when requiresOnDeviceRecognition is set to true.
This doesn't happen with requiresOnDeviceRecognition set to false.
System environment: macOS 14 with Xcode 15, deploying to iOS 17
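For reference, this is the kind of setup involved, as a minimal sketch (audio-engine plumbing omitted, so only the recognition request and the flag in question are shown):
import Speech

// Minimal sketch: a buffer-based recognition request with on-device recognition forced on.
let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))!
let request = SFSpeechAudioBufferRecognitionRequest()
request.shouldReportPartialResults = true
request.requiresOnDeviceRecognition = true   // the flag the behavior depends on

let task = recognizer.recognitionTask(with: request) { result, error in
    if let result = result {
        // With requiresOnDeviceRecognition == true, earlier sentences appear to be
        // dropped from bestTranscription once a new sentence starts.
        print(result.bestTranscription.formattedString)
    }
}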
Thank you all!
Hey guys,
If your download is still getting stuck at 99% even after switching Wi-Fi on and off, I believe I have a fix.
Turn your cellular data completely off; that forces the device to use only Wi-Fi.
Then the download will restart and actually complete.
I can use BLAS and LAPACK functions via the Accelerate framework to perform vector and matrix arithmetic and linear algebra calculations. But do these functions take advantage of Apple Silicon features?
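For concreteness, this is the kind of call I mean; a minimal sketch multiplying two small row-major matrices with the classic CBLAS interface from Swift:
import Accelerate

// C = A * B for two 2x2 row-major matrices, via BLAS.
let m: Int32 = 2, n: Int32 = 2, k: Int32 = 2
let a: [Double] = [1, 2,
                   3, 4]
let b: [Double] = [5, 6,
                   7, 8]
var c = [Double](repeating: 0, count: Int(m * n))

cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
            m, n, k,
            1.0, a, k,
            b, n,
            0.0, &c, n)

print(c)  // [19.0, 22.0, 43.0, 50.0]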
I understand we can use an MPSImageBatch as input to the
[MPSNNGraph encodeBatchToCommandBuffer: ...]
method.
That said, all inputs to the MPSNNGraph need to be encapsulated in MPSImage objects.
Suppose I have a machine learning application that trains/infers on thousands of inputs, each with 4 feature channels, and Metal Performance Shaders is chosen as the primary AI backbone for real-time use.
Due to the nature of the encodeBatchToCommandBuffer method, I first have to create an MTLTexture as a 2D texture array, with a pixel width of 1, a height of 1, and the RGBA32Float pixel format.
The general setup will be:
#define NumInputDims 4

MPSImageBatch *infBatch = @[];
const uint32_t totalFeatureSets = N;

// Each slice is 4 (RGBA) channels.
const uint32_t totalSlices = (totalFeatureSets * NumInputDims + 3) / 4;

MTLTextureDescriptor *descriptor =
    [MTLTextureDescriptor texture2DDescriptorWithPixelFormat: MTLPixelFormatRGBA32Float
                                                        width: 1
                                                       height: 1
                                                    mipmapped: NO];
descriptor.textureType = MTLTextureType2DArray;
descriptor.arrayLength = totalSlices;

id<MTLTexture> texture = [mDevice newTextureWithDescriptor: descriptor];

// Bytes per row is `4 * sizeof(float)` since we're writing one pixel of RGBA32Float.
[texture replaceRegion: MTLRegionMake3D(0, 0, 0, 1, 1, totalSlices)
           mipmapLevel: 0
             withBytes: inputFeatureBuffers[0].data()
           bytesPerRow: 4 * sizeof(float)];

MPSImage *infQueryImage = [[MPSImage alloc] initWithTexture: texture
                                             featureChannels: NumInputDims];
infBatch = [infBatch arrayByAddingObject: infQueryImage];
The training/inference will be:
MPSNNGraph *mInferenceGraph = /* some MPSNNGraph setup */;

MPSImageBatch *returnImage = [mInferenceGraph encodeBatchToCommandBuffer: commandBuffer
                                                             sourceImages: @[infBatch]
                                                             sourceStates: nil
                                                       intermediateImages: nil
                                                        destinationStates: nil];
// Commit and wait...
// Read the return image for the inferred result.
As you can see, the setup is really ad hoc - a lot of 1x1 pixels just for this sole purpose.
Is there any better way I can achieve the same result while still using Metal Performance Shaders? A further question would be: can MPS handle general machine learning cases other than CNNs? From both the online documentation and the header files, the APIs seem to revolve around convolutional networks.
Any response will be helpful, thank you.
Hi everyone!
I appreciate your help. I am a researcher and I use UMAP to cluster my data. Reproducibility is a key requirement for my field, so I set a random seed for reproducibility.
After coming back to my project after some time, I do not get the same results as before, even though I am working in a virtual environment that I did not change.
When pondering the reasons, I remembered that I upgraded my OS from Sonoma 14.1.1 to 14.5, so I was wondering whether the change in OS might cause these issues.
I'm sorry if this question is obvious to developer folks, but before I downgrade my OS or create a virtual machine, any tip is much appreciated. Thank you!
Hi folks, I'm trying to import data to train a model and getting the above error. I'm using the latest Xcode, have double checked the formatting in the annotations file, and used jpgrepair to remove any corruption from the data files. Next step is to try a different dataset, but is this a particular known error? (Or am I doing something obviously wrong?)
2019 Intel Mac, Xcode 15.4, macOS Sonoma 14.1.1
Thanks
Trying to experiment with Genmoji per the WWDC documentation and samples, but I don't seem to get the Genmoji keyboard.
I see this error in my log:
Received port for identifier response: <(null)> with error:Error Domain=RBSServiceErrorDomain Code=1 "Client not entitled" UserInfo={RBSEntitlement=com.apple.runningboard.process-state,
NSLocalizedFailureReason=Client not entitled, RBSPermanent=false}
elapsedCPUTimeForFrontBoard couldn't generate a task port
Is anything presently supported for developers? All I have done here is a simple app with a UITextView and code for:
textView.supportsAdaptiveImageGlyph = true
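For completeness, the whole setup is roughly this sketch (plain UIKit; the class name is mine):
import UIKit

final class GenmojiTestViewController: UIViewController {
    private let textView = UITextView()

    override func viewDidLoad() {
        super.viewDidLoad()
        textView.frame = view.bounds
        textView.autoresizingMask = [.flexibleWidth, .flexibleHeight]
        view.addSubview(textView)

        if #available(iOS 18.0, *) {
            // Opt the text view into adaptive image glyphs (Genmoji).
            textView.supportsAdaptiveImageGlyph = true
        }
        textView.becomeFirstResponder()
    }
}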
Any thoughts?
As a user, when viewing a photo or image, I want to be able to tell Siri, “add this to ”, similar to the example from the WWDC presentation where a photo is added to a note in the Notes app.
Is this... possible with app domains as they are documented?
I see domains like open-file and open-photo, but I don't know if those are appropriate for this kind of functionality?
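To make the question concrete, here is a minimal, hypothetical App Intent that accepts an image file; the intent and parameter names are mine, and whether Siri can route "add this" from a photo context to something like it under the documented domains is exactly what I'm asking:
import AppIntents

// Hypothetical intent; names are illustrative, not taken from a documented schema.
struct AddImageToCollectionIntent: AppIntent {
    static var title: LocalizedStringResource = "Add Image to Collection"

    @Parameter(title: "Image")
    var image: IntentFile

    @Parameter(title: "Collection")
    var collectionName: String

    func perform() async throws -> some IntentResult {
        // App-specific code would store the incoming image in the named collection here.
        return .result()
    }
}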
Hi everyone,
I'm working on an iOS app built in Swift using Xcode, where I'm integrating Roboflow's object detection API to extract items from grocery receipts. My goal is to identify key information (like items, total, tax, etc.) from the images of these receipts.
I'm successfully sending images to the Roboflow API and receiving predictions with bounding box data, but when I attempt to extract text from the detected regions (bounding boxes), it appears that the text extraction is failing—no text is being recognized. The issue seems to be that the bounding boxes are either not properly being handled or something is going wrong in the way I process the API response.
Here's a brief breakdown of what I'm doing:
The image is captured, converted to base64, and sent to the Roboflow API.
The API response comes back with bounding boxes for the detected elements (items, date, subtotal, etc.).
The problem occurs when I try to extract the text from the image using the bounding box data—it seems like the bounding boxes are being found, but no text is returned.
I suspect the issue might be happening because the app’s segue to the results view controller is triggered before the OCR extraction completes, or there might be a problem in my code handling the bounding box response.
Response Data:
{
  "inference_id": "77134cce-91b5-4600-a59b-fab74350ca06",
  "time": 0.09240847699993537,
  "image": {
    "width": 370,
    "height": 502
  },
  "predictions": [
    {
      "x": 163.5,
      "y": 250.5,
      "width": 313.0,
      "height": 127.0,
      "confidence": 0.9357666373252869,
      "class": "Item",
      "class_id": 1,
      "detection_id": "753341d5-07b6-42a1-8926-ecbc61128243"
    },
    {
      "x": 52.5,
      "y": 417.5,
      "width": 89.0,
      "height": 23.0,
      "confidence": 0.8819760680198669,
      "class": "Date",
      "class_id": 0,
      "detection_id": "b4681149-d538-47b1-8700-d9528bf1daa0"
    },
    ...
  ]
}
And the log showing bounding boxes:
Prediction: ["width": 313, "y": 250.5, "x": 163.5, "detection_id": 753341d5-07b6-42a1-8926-ecbc61128243, "class": Item, "height": 127, "confidence": 0.9357666373252869, "class_id": 1]
No bounding box found in prediction.
I've double-checked the bounding box coordinates, and everything seems fine. Does anyone have experience with using OCR alongside object detection APIs in Swift? Any help on how to ensure the bounding boxes are properly processed and used for OCR would be greatly appreciated!
Also, would it help to delay the segue to the results view controller until OCR is complete?
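For reference, this is the kind of crop-then-OCR flow I have in mind, as a minimal sketch. It assumes the Roboflow x/y values are the center of the box in pixel coordinates of the uploaded image (if the API resizes the image, the boxes would need rescaling first), and it only calls back once Vision finishes, so the segue could be triggered from the completion:
import Vision
import UIKit

// `prediction` mirrors one entry of the "predictions" array above.
func recognizeText(in image: UIImage,
                   prediction: [String: Any],
                   completion: @escaping (String) -> Void) {
    guard let cgImage = image.cgImage,
          let x = prediction["x"] as? Double,
          let y = prediction["y"] as? Double,
          let w = prediction["width"] as? Double,
          let h = prediction["height"] as? Double else {
        completion("")
        return
    }

    // Convert the center-based box to a top-left CGRect and crop that region.
    let box = CGRect(x: x - w / 2, y: y - h / 2, width: w, height: h)
    guard let cropped = cgImage.cropping(to: box) else {
        completion("")
        return
    }

    let request = VNRecognizeTextRequest { request, _ in
        let text = (request.results as? [VNRecognizedTextObservation])?
            .compactMap { $0.topCandidates(1).first?.string }
            .joined(separator: " ") ?? ""
        // Hop back to the main queue before triggering the segue.
        DispatchQueue.main.async { completion(text) }
    }
    request.recognitionLevel = .accurate

    DispatchQueue.global(qos: .userInitiated).async {
        try? VNImageRequestHandler(cgImage: cropped, options: [:]).perform([request])
    }
}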
Thank you!
When I use VNGenerateForegroundInstanceMaskRequest to generate a mask in the simulator (from SwiftUI), I get the error "Could not create inference context".
I then added code to force Vision to run on the CPU:
let request = VNGenerateForegroundInstanceMaskRequest()
let handler = VNImageRequestHandler(ciImage: inputImage)

#if targetEnvironment(simulator)
if #available(iOS 18.0, *) {
    let allDevices = MLComputeDevice.allComputeDevices
    for device in allDevices {
        if device.description.contains("MLCPUComputeDevice") {
            request.setComputeDevice(.some(device), for: .main)
            break
        }
    }
} else {
    // Fallback on earlier versions
    request.usesCPUOnly = true
}
#endif

do {
    try handler.perform([request])
    if let result = request.results?.first {
        let mask = try result.generateScaledMaskForImage(forInstances: result.allInstances, from: handler)
        return CIImage(cvPixelBuffer: mask)
    }
} catch {
    print(error)
}
Even when I force the simulator to run the request on the CPU, it still fails with the same error: "Could not create inference context".
The Metal plugin for TensorFlow had its GitHub repo taken down, and on PyPI the last update was a year ago, for TF 2.14. What's the status of the Metal plugin? For now it seems to work fine with TF 2.15, but what's the plan for the future?
Was just wondering, not sure if anyone else has thought about this.
Different sound output devices have different sound characteristics.
Could something be added to the Bluetooth settings that detects when the connected device is an audio device and automatically applies a different EQ (per the user's preference)?
So it's somewhat like each audio device would have a specific EQ stored for it, which could be recognized via Bluetooth.
I’m currently developing an app that features a main view with a UITableView. When users select a row, they are navigated to a detail view that contains a UITextField. This UITextField already supports Writing Tools.
My question is: when a user long-presses a UITableView cell, is it possible to add a Writing Tools option to the context menu, allowing users to interact with Writing Tools more conveniently? For example, to summarize the cell's detail text.
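For what it's worth, attaching a custom action to the cell's context menu is straightforward; in the sketch below the "Summarize" action is just a placeholder, and whether such an action can hand the text off to the system Writing Tools UI is exactly the part I'm unsure about:
// Sketch: a custom context-menu action on a table view cell (UITableViewDelegate).
func tableView(_ tableView: UITableView,
               contextMenuConfigurationForRowAt indexPath: IndexPath,
               point: CGPoint) -> UIContextMenuConfiguration? {
    UIContextMenuConfiguration(identifier: nil, previewProvider: nil) { _ in
        let summarize = UIAction(title: "Summarize",
                                 image: UIImage(systemName: "text.append")) { _ in
            // Placeholder: I don't know of a public API that presents Writing Tools
            // from here; one option is to push the detail view and rely on the
            // UITextField's built-in Writing Tools support.
        }
        return UIMenu(children: [summarize])
    }
}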
I believe I am encountering a bug in the MPS backend of Core ML.
There seems to be an invalid conversion of a slice_by_index + gather operation, resulting in the wrong values being indexed during GPU execution.
The following Python program, using the coremltools library, illustrates the issue:
import tempfile

import numpy as np
import torch

import coremltools as ct
from coremltools.converters.mil import Builder as mb
from coremltools.converters.mil.mil import types
dB = 20480
shapeI = (2, dB)
shapeB = (dB, 22)
@mb.program(input_specs=[mb.TensorSpec(shape=shapeI, dtype=types.int32),
                         mb.TensorSpec(shape=shapeB)])
def prog(i, b):
    lslice = mb.slice_by_index(x=i, begin=[0, 0], end=[1, dB], end_mask=[False, True],
                               squeeze_mask=[True, False], name='slice_left')
    rslice = mb.slice_by_index(x=i, begin=[1, 0], end=[2, dB], end_mask=[False, True],
                               squeeze_mask=[True, False], name='slice_right')
    ldata = mb.gather(x=b, indices=lslice)
    rdata = mb.gather(x=b, indices=rslice)  # actual bug in optimization of gather+slice
    x = mb.add(x=ldata, y=rdata)
    # dummy ops to make a bigger graph to run on GPU
    x = mb.mul(x=x, y=2.)
    x = mb.mul(x=x, y=.5)
    x = mb.mul(x=x, y=2.)
    x = mb.mul(x=x, y=.5)
    x = mb.mul(x=x, y=2.)
    x = mb.mul(x=x, y=.5)
    x = mb.mul(x=x, y=2.)
    x = mb.mul(x=x, y=.5)
    x = mb.mul(x=x, y=2.)
    x = mb.mul(x=x, y=.5)
    x = mb.mul(x=x, y=2.)
    x = mb.mul(x=x, y=.5)
    x = mb.mul(x=x, y=2.)
    x = mb.mul(x=x, y=.5)
    x = mb.mul(x=x, y=1., name='result')
    return x
input_types = [
    ct.TensorType(name="i", shape=shapeI, dtype=np.int32),
    ct.TensorType(name="b", shape=shapeB, dtype=np.float32),
]

with tempfile.TemporaryDirectory() as tmpdirname:
    model_cpu = ct.convert(prog,
                           inputs=input_types,
                           compute_precision=ct.precision.FLOAT32,
                           compute_units=ct.ComputeUnit.CPU_ONLY,
                           package_dir=tmpdirname + 'model_cpu.mlpackage')
    model_gpu = ct.convert(prog,
                           inputs=input_types,
                           compute_precision=ct.precision.FLOAT32,
                           compute_units=ct.ComputeUnit.CPU_AND_GPU,
                           package_dir=tmpdirname + 'model_gpu.mlpackage')

    inputs = {
        "i": torch.randint(0, shapeB[0], shapeI, dtype=torch.int32),
        "b": torch.rand(shapeB, dtype=torch.float32),
    }

    cpu_output = model_cpu.predict(inputs)
    gpu_output = model_gpu.predict(inputs)

    # equivalent to prog
    expected = inputs["b"][inputs["i"][0]] + inputs["b"][inputs["i"][1]]
    # what actually happens on GPU
    actual = inputs["b"][inputs["i"][0]] + inputs["b"][inputs["i"][0]]

    print(f"diff expected vs cpu: {np.sum(np.absolute(expected - cpu_output['result']))}")
    print(f"diff expected vs gpu: {np.sum(np.absolute(expected - gpu_output['result']))}")
    print(f"diff actual vs gpu: {np.sum(np.absolute(actual - gpu_output['result']))}")
The issue seems to occur in the slice_right + gather operations when executed on the GPU: the wrong items of input "i" are selected.
The program outputs:
diff expected vs cpu: 0.0
diff expected vs gpu: 150104.015625
diff actual vs gpu: 0.0
This behavior has been tested on a 14-inch MacBook Pro (2023, M2 Pro) running macOS 14.7, using coremltools 8.0b2 with Python 3.9.19.
Does the new Image Playground API allow programmatically generating images? Can the app generate and use them without the API's UI, or would that require using another generative image model?
I need to add the AI Image Playground to my iOS app with UIKit. WWDC 2024 introduced the new Image Playground API, but I haven't found any official documentation yet, so how can I add it?
VNRecognizeTextRequestRevision2 does not recognize upside-down English text, while VNRecognizeTextRequestRevision3 can recognize English text even when it is upside down.
Up to iOS 17, I could select revision 2 or revision 3 in my code (my minimum deployment target is iOS 16), depending on whether upside-down text detection was required.
But on iOS 18, even if I set revision 2 in my code, the result seems to be based on revision 3, because upside-down text is detected.
I know revision 2 was deprecated in iOS 18.
How can I tell whether the observation result is upside down or not? Is there any solution with revision 3?
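For reference, this is roughly how I pin the revision today, as a minimal sketch (on iOS 18 the results no longer seem to honor it):
import Vision

let request = VNRecognizeTextRequest { request, error in
    guard let observations = request.results as? [VNRecognizedTextObservation] else { return }
    for observation in observations {
        print(observation.topCandidates(1).first?.string ?? "")
    }
}

// Pin revision 2 explicitly so upside-down text is *not* recognized.
if VNRecognizeTextRequest.supportedRevisions.contains(VNRecognizeTextRequestRevision2) {
    request.revision = VNRecognizeTextRequestRevision2
}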
All errors in TranslationError return the same error code, making it difficult to differentiate between them. How can this issue be resolved?
Xcode Version: Version 15.2 (15C500b)
com.github.apple.coremltools.source: torch==1.12.1
com.github.apple.coremltools.version: 7.2
Compute: Mixed (Float16, Int32)
Storage: Float16
The input to the mlpackage is MultiArray (Float16 1 × 1 × 544 × 960)
The flexibility is: 1 × 1 × 544 × 960 | 1 × 1 × 384 × 640 | 1 × 1 × 736 × 1280 | 1 × 1 × 1088 × 1920
I tested this on iPhone XR, iPhone 11, iPhone 12, iPhone 13, and iPhone 14. On all devices except the iPhone 11, the model runs correctly on the NPU. However, on the iPhone 11, the model runs on the CPU instead.
Here is the CoreMLTools conversion code I used:
import coremltools as ct
import numpy as np

mlmodel = ct.convert(trace,
                     inputs=[ct.TensorType(shape=input_shape, name="input", dtype=np.float16)],
                     outputs=[ct.TensorType(name="output", dtype=np.float16, shape=output_shape)],
                     convert_to='mlprogram',
                     minimum_deployment_target=ct.target.iOS16)
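One thing I plan to check on the iPhone 11, as a sketch (the MyModel class name is a placeholder for the Xcode-generated model class): load the model with the compute units restricted and profile with the Core ML Instruments template to see where each layer actually runs.
import CoreML

// "MyModel" is a placeholder for the generated model class.
let config = MLModelConfiguration()
config.computeUnits = .cpuAndNeuralEngine   // compare with .all and .cpuOnly

do {
    let model = try MyModel(configuration: config)
    // Run a representative prediction here and inspect the Core ML Instruments
    // trace to see which compute unit executes each layer.
    _ = model
} catch {
    print("Model failed to load with the requested compute units: \(error)")
}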