ML Compute

Accelerate training and validation of neural networks using the CPU and GPUs.

Posts under ML Compute tag

38 Posts
Post | Replies | Boosts | Views | Activity

Segmentation Fault in np.matmul on macOS 15.2 with Accelerate BLAS
I'm encountering a segmentation fault when using np.matmul with relatively small arrays on macOS 15.2. The issue only occurs in specific scenarios and results in a crash with the following error:

    Exception Type: EXC_BAD_ACCESS (SIGSEGV)
    Exception Codes: KERN_INVALID_ADDRESS at 0x0000000000000110
    Termination Reason: Namespace SIGNAL, Code 11
    Segmentation fault: 11

Full error log: Gist link

The crash consistently occurs on a specific line where np.matmul is called, despite similar np.matmul operations succeeding earlier in the same script. The issue cannot be reproduced in a separate script that contains identical operations. When I build the NumPy wheel using OpenBLAS, this issue no longer arises, which leads me to believe that it is related to a problem with Accelerate.

Environment
NumPy Version: 2.1.3
Python Version: 3.12.7
OS Version: macOS 15.2

BLAS Configuration:

    Build Dependencies:
      blas:
        detection method: system
        found: true
        include directory: unknown
        lib directory: unknown
        name: accelerate
        openblas configuration: unknown
        pc file directory: unknown
        version: unknown
      lapack:
        detection method: system
        found: true
        include directory: unknown
        lib directory: unknown
        name: accelerate
        openblas configuration: unknown
        pc file directory: unknown
        version: unknown
    Compilers:
      c:
        commands: cc
        linker: ld64
        name: clang
        version: 15.0.0
      c++:
        commands: c++
        linker: ld64
        name: clang
        version: 15.0.0
      cython:
        commands: cython
        linker: cython
        name: cython
        version: 3.0.11
    Machine Information:
      build:
        cpu: aarch64
        endian: little
        family: aarch64
        system: darwin
      host:
        cpu: aarch64
        endian: little
        family: aarch64
        system: darwin
Replies: 1 | Boosts: 0 | Views: 100 | Activity: 5d
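One way to narrow down a crash like the one in the np.matmul report above is to run the suspect call in a fresh interpreter, so that a segmentation fault shows up as an exit status instead of taking down the whole script. The following is only a sketch under assumptions: the helper name `run_isolated` and the 64x64 array shapes are hypothetical placeholders (the post does not give the real shapes), and `np.show_config()` is called only to print which BLAS the installed NumPy was built against.

```python
import subprocess
import sys


def run_isolated(snippet: str, timeout: float = 120.0) -> int:
    """Run `snippet` in a fresh Python process and return its exit code.

    On POSIX systems a negative return code -N means the child was
    terminated by signal N (for example -11 for SIGSEGV).
    """
    proc = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    # Echo the child's output so the BLAS configuration stays visible.
    sys.stdout.write(proc.stdout)
    sys.stderr.write(proc.stderr)
    return proc.returncode


# Hypothetical reproduction: the shapes are placeholders, not the ones
# from the original report.
MATMUL_SNIPPET = """
import numpy as np
np.show_config()
a = np.random.rand(64, 64)
b = np.random.rand(64, 64)
print(np.matmul(a, b).shape)
"""

if __name__ == "__main__":
    code = run_isolated(MATMUL_SNIPPET)
    print("child exit code:", code)
```

Since the report says the crash does not reproduce in a separate script, a harness like this may be more useful when the snippet is grown step by step toward the failing script, checking the exit code after each addition.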
Troubleshooting Apple Vision Framework Errors
I downloaded the sample project "Analyzing a Selfie and Visualizing Its Content" from Apple's documentation and opened it in Xcode. However, I encountered the following error:

    VTEST: error: perform(_:): inside 'for await result in resultStream' error: internalError("Error Domain=com.apple.Vision Code=9 \"Could not create inference context\" UserInfo={NSLocalizedDescription=Could not create inference context}")
    VTEST: error: DetectFaceRectanglesRequest was cancelled.
    VTEST: error: DetectFaceRectanglesRequest was cancelled.
    Error Domain=com.apple.Vision Code=9 "Could not create inference context" UserInfo={NSLocalizedDescription=Could not create inference context}

How can I resolve this issue? Thanks in advance!
Replies: 0 | Boosts: 0 | Views: 71 | Activity: 1w
Broken compatibility in tensorflow-metal with tensorflow 2.18
Issue type: Bug
TensorFlow metal version: 1.1.1
TensorFlow version: 2.18
OS platform and distribution: macOS 15.2
Python version: 3.11.11
GPU model and memory: Apple M2 Max GPU 38-cores

Standalone code to reproduce the issue:

    import tensorflow as tf

    if __name__ == '__main__':
        gpus = tf.config.experimental.list_physical_devices('GPU')
        print(gpus)

Current behavior

Apple silicon GPU with tensorflow-metal==1.1.0 and python 3.11 works fine with tensorboard==2.17.0. This is the normal output:

    /Users/mspanchenko/anaconda3/envs/cryptoNN_ml_core/bin/python /Users/mspanchenko/VSCode/cryptoNN/ml/core_second_window/test_tensorflow_gpus.py
    [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
    Process finished with exit code 0

But if I upgrade tensorflow to 2.18, I get this error:

    /Users/mspanchenko/anaconda3/envs/cryptoNN_ml_core/bin/python /Users/mspanchenko/VSCode/cryptoNN/ml/core_second_window/test_tensorflow_gpus.py
    Traceback (most recent call last):
      File "/Users/mspanchenko/VSCode/cryptoNN/ml/core_second_window/test_tensorflow_gpus.py", line 1, in <module>
        import tensorflow as tf
      File "/Users/mspanchenko/anaconda3/envs/cryptoNN_ml_core/lib/python3.11/site-packages/tensorflow/__init__.py", line 437, in <module>
        _ll.load_library(_plugin_dir)
      File "/Users/mspanchenko/anaconda3/envs/cryptoNN_ml_core/lib/python3.11/site-packages/tensorflow/python/framework/load_library.py", line 151, in load_library
        py_tf.TF_LoadLibrary(lib)
    tensorflow.python.framework.errors_impl.NotFoundError: dlopen(/Users/mspanchenko/anaconda3/envs/cryptoNN_ml_core/lib/python3.11/site-packages/tensorflow-plugins/libmetal_plugin.dylib, 0x0006): Symbol not found: __ZN3tsl8internal10LogMessageC1EPKcii
      Referenced from: <D2EF42E3-3A7F-39DD-9982-FB6BCDC2853C> /Users/mspanchenko/anaconda3/envs/cryptoNN_ml_core/lib/python3.11/site-packages/tensorflow-plugins/libmetal_plugin.dylib
      Expected in: <2814A58E-D752-317B-8040-131217E2F9AA> /Users/mspanchenko/anaconda3/envs/cryptoNN_ml_core/lib/python3.11/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so
    Process finished with exit code 1
Replies: 1 | Boosts: 2 | Views: 266 | Activity: 1w
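Because the failure in the tensorflow-metal report above happens during `import tensorflow`, it can help to read the installed package versions without importing TensorFlow at all. This is a minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical, and the package list is an assumption about what might be installed in such an environment. It makes no claim about which version pairs are compatible.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent.

    Reads distribution metadata only, so no package code (and no
    TensorFlow plugin) is loaded.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumed package names; adjust to the environment being inspected.
    for name in ("tensorflow", "tensorflow-macos", "tensorflow-metal"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```

Posting this output alongside the traceback makes it easier for others to reproduce the exact combination that fails.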
How to confirm whether Create ML is training
I am currently training a Tabular Classification model in Create ML. The dataset has 30 features, with 1,000,000 training data points and 1,000,000 validation data points. Could you please estimate the approximate training time on an M4 Max MacBook Pro? During training, Create ML has been displaying the “Processing” status, but there is no progress bar. I would like to confirm whether training is still running, as I have often suspected that it has stopped.
Replies: 1 | Boosts: 0 | Views: 241 | Activity: 2w
WebGPU Enabled but WKWebView doesn't have GPU Access
We enabled the WebGPU feature flag in Safari on iOS 18.2. This gives Safari access to the GPU, but WKWebView still doesn't have GPU access. Can WKWebView not access the GPU through the Safari feature flag? Is there some other mechanism through which we can enable GPU access for WKWebView? We are testing GPU access by loading https://webgpureport.org/

Regards,
Saalis Umer
Microsoft

Safari feature flag: webgpu = true
[Screenshots: Safari GPU access; WKWebView GPU access]
Replies: 0 | Boosts: 0 | Views: 247 | Activity: Dec ’24
The YOLO11 object detection model I exported to Core ML stopped working in the macOS 15.2 beta.
After updating to the macOS 15.2 beta, the YOLO11 object detection model exported to Core ML outputs incorrect and abnormal bounding boxes. It also doesn't work in iOS apps built on a Mac running 15.2. The same model worked fine on macOS 14.1. The result above is obtained when training a custom YOLO11 model in Python, exporting it to Core ML, and testing it in the Preview tab of the mlpackage on macOS 15.2 with Xcode 16.0.
Replies: 4 | Boosts: 1 | Views: 413 | Activity: 2w
Unable to Use M1 Mac Pro Max GPU for TensorFlow Model Training
Hi Everyone, I'm currently facing an issue where TensorFlow is unable to detect the GPU on my M1 Mac for model training. When I run the following code to check for available GPUs:

    import tensorflow as tf
    print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))

the output is:

    Num GPUs Available: 0

I have already followed the steps in Apple's developer documentation: https://developer.apple.com/metal/tensorflow-plugin/

System Information:
Device: M1 Mac Pro Max
Python Version: 3.12.2
TensorFlow Version: 2.17.0
OS: macOS Sequoia (15.1)

Questions:
Is there any additional configuration required to enable GPU support on M1 Macs?
Are there specific TensorFlow versions that I should be using for better compatibility?
Has anyone else faced this issue, and how did you resolve it?
Replies: 0 | Boosts: 1 | Views: 478 | Activity: Nov ’24
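For a "Num GPUs Available: 0" report like the one above, one thing that may be worth ruling out is the interpreter itself: `platform.machine()` reports `arm64` for a native Apple silicon Python and `x86_64` for one running under Rosetta. The sketch below is an assumption-laden diagnostic, not a fix; the function names `interpreter_summary` and `visible_gpus` are hypothetical, and nothing in the post confirms that architecture is the cause.

```python
import platform
import sys


def interpreter_summary() -> dict:
    """Collect basic facts about the running interpreter."""
    return {
        "python": sys.version.split()[0],
        "system": platform.system(),    # "Darwin" on macOS
        "machine": platform.machine(),  # "arm64" for native Apple silicon
    }


def visible_gpus():
    """Return TensorFlow's GPU device list, or None if the import fails."""
    try:
        import tensorflow as tf
    except Exception as exc:  # ImportError or a plugin load failure
        print("TensorFlow import failed:", exc)
        return None
    return tf.config.list_physical_devices("GPU")


if __name__ == "__main__":
    print(interpreter_summary())
    print("GPUs:", visible_gpus())
```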
macOS 15.x crashes in MetalPerformanceShadersGraph
In our app we use Core ML. Ever since macOS 15.x was released, we have been getting a large number of crashes like this:

    Incident Identifier: 424041c3-884b-4e50-bb5a-429a83c3e1c8
    CrashReporter Key: B914246B-1291-4D44-984D-EDF84B52310E
    Hardware Model: Mac14,12
    Process: <REMOVED> [1509]
    Path: /Applications/<REMOVED>
    Identifier: com.<REMOVED>
    Version: <REMOVED>
    Code Type: arm64
    Parent Process: launchd [1]
    Date/Time: 2024-11-13T13:23:06.999Z
    Launch Time: 2024-11-13T13:22:19Z
    OS Version: Mac OS X 15.1.0 (24B83)
    Report Version: 104
    Exception Type: SIGABRT
    Exception Codes: #0 at 0x189042600
    Crashed Thread: 36

    Thread 36 Crashed:
    0 libsystem_kernel.dylib 0x0000000189042600 __pthread_kill + 8
    1 libsystem_c.dylib 0x0000000188f87908 abort + 124
    2 libsystem_c.dylib 0x0000000188f86c1c __assert_rtn + 280
    3 Metal 0x0000000193fdd870 MTLReportFailure.cold.1 + 44
    4 Metal 0x0000000193fb9198 MTLReportFailure + 444
    5 MetalPerformanceShadersGraph 0x0000000222f78c80 -[MPSGraphExecutable initWithMPSGraphPackageAtURL:compilationDescriptor:] + 296
    6 Espresso 0x00000001a290ae3c E5RT::SharedResourceFactory::GetMPSGraphExecutable(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, NSDictionary*) + 932
    . . .
    43 CoreML 0x0000000192d263bc -[MLModelAsset modelWithConfiguration:error:] + 120
    44 CoreML 0x0000000192da96d0 +[MLModel modelWithContentsOfURL:configuration:error:] + 176
    45 <REMOVED> 0x000000010497b758 -[<REMOVED> <REMOVED>] (<REMOVED>)

No similar crashes on macOS 12-14!

MetalPerformanceShadersGraph.log

Any clue what is causing this? Thanks! :)
Replies: 2 | Boosts: 1 | Views: 437 | Activity: Dec ’24
How to Fine-Tune the SNSoundClassifier for Custom Sound Classification in iOS?
Hi Apple Developer Community, I’m exploring ways to fine-tune the SNSoundClassifier to allow users of my iOS app to personalize the model by adding custom sounds or adjusting predictions. While Apple’s WWDC session on sound classification explains how to train from scratch, I’m specifically interested in using SNSoundClassifier as the base model and building/fine-tuning on top of it. Here are a few questions I have:

1. Fine-Tuning on SNSoundClassifier: Is there a way to fine-tune this model programmatically through APIs? The manual approach using macOS, as shown in this documentation, is clear, but how can it be done dynamically, either within the app for users or in a cloud backend (AWS/iCloud)? Are there APIs or classes that support such on-device/cloud-based fine-tuning or incremental learning? If not directly, can the classifier’s embeddings be used to train a lightweight custom layer? Training is likely computationally intensive and would drain too much battery, so doing it in the cloud may be the right way, but I need the right APIs to get this done. Sample code would be helpful.

2. Recommended Approach for In-App Model Customization: If SNSoundClassifier doesn’t support fine-tuning, would transfer learning on models like MobileNetV2, YAMNet, OpenL3, or FastViT be more suitable? Given these models (SNSoundClassifier, MobileNetV2, YAMNet, OpenL3, FastViT), which one would be best for accuracy and performance/efficiency on iOS? I aim to maintain real-time performance without sacrificing battery life. It is also important to consider architecture retention and accuracy after conversion to a Core ML model.

3. Cost-Effective Backend Setup for Training: Mac EC2 instances on AWS have a 24-hour minimum billing, which can become expensive for limited user requests. Are there better alternatives for deploying and training models on request, when a user uploads files (training data)?

4. TensorFlow vs PyTorch: Between TensorFlow and PyTorch, which framework would you recommend for iOS Core ML integration? TensorFlow Lite offers mobile-optimized models, but I’m also curious about PyTorch’s performance when converted to Core ML.

5. Metrics: The metrics I have in mind while picking the model are: Publisher, Accuracy, Fine-Tuning capability, Real-Time/Live use, Suitability for iPhone 16, Architectural retention after Core ML conversion, Reasons for unsuitability, Recommended use case.

Any insights or recommended approaches would be greatly appreciated. Thanks in advance!
Replies: 6 | Boosts: 1 | Views: 718 | Activity: Dec ’24
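To make the "lightweight custom layer on top of embeddings" idea from the post above concrete, here is a minimal nearest-centroid sketch. It is an illustration only: it assumes some feature extractor already yields fixed-length embedding vectors (the post does not establish whether SNSoundClassifier exposes its embeddings), the function names `fit_centroids` and `predict` are hypothetical, and nearest-centroid is a stand-in technique chosen for its small footprint, not something the post proposes.

```python
from collections import defaultdict
from math import dist


def fit_centroids(embeddings, labels):
    """Average the embedding vectors of each label into one centroid."""
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(embeddings, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict(centroids, vec):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], vec))


if __name__ == "__main__":
    # Toy 2-D "embeddings"; real ones would come from an audio model.
    train = [[0, 0], [0, 2], [10, 10], [10, 12]]
    labels = ["doorbell", "doorbell", "kettle", "kettle"]
    model = fit_centroids(train, labels)
    print(predict(model, [1, 1]))
```

Adding a new user sound in this scheme only requires averaging a few vectors, which is cheap enough to consider doing on device; whether its accuracy is acceptable would need to be measured on real recordings.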
Does iPhone 15 Pro Use a Single Microphone or Multiple Microphones for Voice and Sound Recognition?
Hello, I have a question regarding the voice and sound recognition features on the iPhone 15 Pro. The iPhone 15 Pro is equipped with four microphones, and I understand that for features like Apple’s sound recognition and when invoking Siri, the microphone(s) must always be active. My question is whether the device uses a single microphone (mono channel) for these functions or if multiple microphones are activated simultaneously. I would appreciate clarification on how the microphones are utilized in sound and voice recognition features. Thank you for your assistance. Best regards.
Replies: 1 | Boosts: 0 | Views: 519 | Activity: Oct ’24
Optimizing YOLOv8 for Real-Time Object Detection in a Specific Screen Area
I’m working on real-time object detection using YOLOv8, but I only need to detect objects in approximately 40% of the screen area. Is it possible to limit the captureOut method to focus solely on that specific region of the screen? If this isn’t feasible, I’m considering an approach where the full-screen pixel buffer is captured and then cropped to the target area before running detection. However, I’m concerned about how this might affect real-time performance. I’d appreciate any insights on how to maintain real-time performance or suggestions for better alternatives. Thank you!
Replies: 2 | Boosts: 0 | Views: 458 | Activity: Oct ’24
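The crop-then-detect approach described in the post above comes down to two coordinate conversions: turning a normalized region into a pixel rectangle, and mapping boxes detected inside the crop back to full-frame coordinates. The sketch below shows only that arithmetic, in Python for brevity. The `Rect` type, the function names, the 1920x1080 frame size, and the top-left origin are all assumptions for illustration; coordinate conventions in a real capture pipeline may differ.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    x: float
    y: float
    w: float
    h: float


def crop_rect(frame_w: int, frame_h: int, region: Rect) -> Rect:
    """Convert a normalized region (0..1, top-left origin) to pixels."""
    return Rect(
        region.x * frame_w,
        region.y * frame_h,
        region.w * frame_w,
        region.h * frame_h,
    )


def to_full_frame(box: Rect, crop: Rect) -> Rect:
    """Map a box normalized to the crop back to full-frame pixels."""
    return Rect(
        crop.x + box.x * crop.w,
        crop.y + box.y * crop.h,
        box.w * crop.w,
        box.h * crop.h,
    )


if __name__ == "__main__":
    # A region covering 50% of the width and 80% of the height,
    # i.e. 40% of the frame area.
    crop = crop_rect(1920, 1080, Rect(0.25, 0.0, 0.5, 0.8))
    print(crop)
    print(to_full_frame(Rect(0.5, 0.5, 0.1, 0.1), crop))
```

Running the detector on the smaller crop reduces the pixels it has to process, though whether that offsets the cost of cropping each frame is something to measure on the target device.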
Urgent Issue with SoundAnalysis in iOS 18 - Critical Background Permissions Error
We are experiencing a major issue with the native .version1 classifier of the SoundAnalysis framework in iOS 18, which has left all our users without recordings. Our core feature relies heavily on sound analysis in the background, and it worked flawlessly in prior iOS versions. However, in iOS 18, sound analysis stops working in the background and triggers a critical warning.

Details of the issue:
We are using SoundAnalysis to analyze background sounds and have enabled the necessary background permissions.
We are using the latest Xcode.
A warning now appears, and sound analysis fails in the background.

Below is the warning message we are encountering:

    Execution of the command buffer was aborted due to an error during execution. Insufficient Permission (to submit GPU work from background)
    [Espresso::handle_ex_plan] exception=Espresso exception: "Generic error": Insufficient Permission (to submit GPU work from background) (00000006:kIOGPUCommandBufferCallbackErrorBackgroundExecutionNotPermitted); code=7 status=-1
    Unable to compute the prediction using a neural network model. It can be an invalid input data or broken/unsupported model (error code: -1).
    CoreML prediction failed with Error Domain=com.apple.CoreML Code=0 "Failed to evaluate model 0 in pipeline" UserInfo={NSLocalizedDescription=Failed to evaluate model 0 in pipeline, NSUnderlyingError=0x30330e910 {Error Domain=com.apple.CoreML Code=0 "Failed to evaluate model 1 in pipeline" UserInfo={NSLocalizedDescription=Failed to evaluate model 1 in pipeline, NSUnderlyingError=0x303307840 {Error Domain=com.apple.CoreML Code=0 "Unable to compute the prediction using a neural network model. It can be an invalid input data or broken/unsupported model (error code: -1)." UserInfo={NSLocalizedDescription=Unable to compute the prediction using a neural network model. It can be an invalid input data or broken/unsupported model (error code: -1).}}}}}

We urgently need guidance or a fix for this, as our application’s main functionality is severely impacted by this background permission error. Please let us know the next steps or if this is a known issue with iOS 18.
Replies: 12 | Boosts: 11 | Views: 1.6k | Activity: Dec ’24
Code with Swift Assist
Hello, I would like to inquire about the release date of Swift Assist’s beta version. Apple has stated that it will be released later this year, but they have not provided a specific date or time. Could you please provide information on the beta version’s release date? Additionally, is there a trial version available? If so, when was it released? Thank you for your assistance.
Replies: 1 | Boosts: 1 | Views: 1.8k | Activity: Sep ’24
MLTensor computation took more time than expected.
    func testMLTensor() {
        let t1 = MLTensor(shape: [2000, 1], scalars: [Float](repeating: Float.random(in: 0.0...10.0), count: 2000), scalarType: Float.self)
        let t2 = MLTensor(shape: [1, 3000], scalars: [Float](repeating: Float.random(in: 0.0...10.0), count: 3000), scalarType: Float.self)
        for _ in 0...50 {
            let t = Date()
            let x = (t1 * t2)
            print("MLTensor", t.timeIntervalSinceNow * 1000, "ms")
        }
    }
    testMLTensor()

The above code took more time than expected, especially in the early stage of iteration.
Replies: 1 | Boosts: 0 | Views: 578 | Activity: Aug ’24
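The observation in the post above (early iterations slower than later ones) is a common pattern in benchmarks, and a usual way to quantify it is to report warm-up iterations separately from steady-state ones. The sketch below shows that measurement pattern in Python rather than Swift; the helper names `time_calls` and `summarize` and the stand-in workload are hypothetical, and it says nothing about how MLTensor schedules work internally. Note that it computes elapsed time as end minus start, so values are positive, whereas `t.timeIntervalSinceNow` in the post is negative for a `Date` in the past.

```python
import time
from statistics import median


def time_calls(fn, iterations: int) -> list:
    """Call `fn` repeatedly and return per-call wall time in milliseconds."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples, warmup: int) -> dict:
    """Report the median of the first `warmup` samples and of the rest."""
    if not 0 < warmup < len(samples):
        raise ValueError("warmup must be between 1 and len(samples) - 1")
    return {
        "warmup_median_ms": median(samples[:warmup]),
        "steady_median_ms": median(samples[warmup:]),
    }


if __name__ == "__main__":
    # Stand-in workload; replace with the operation under test.
    def workload():
        return sum(i * i for i in range(100_000))

    print(summarize(time_calls(workload, 51), warmup=5))
```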
iOS 18.1 beta - App crashes at runtime while using Translation.TranslationError in project
I'm trying to cast the error thrown by TranslationSession.translations(from:) as Translation.TranslationError. However, the app crashes at runtime whenever Translation.TranslationError is used in the project.

Environment:
iOS Version: 18.1 beta
Xcode Version: 16 beta

    dyld[14615]: Symbol not found: _$s11Translation0A5ErrorVMa
      Referenced from: <3426152D-A738-30C1-8F06-47D2C6A1B75B> /private/var/containers/Bundle/Application/043A25BC-E53E-4B28-B71A-C21F77C0D76D/TranslationAPI.app/TranslationAPI.debug.dylib
      Expected in: /System/Library/Frameworks/Translation.framework/Translation
Replies: 1 | Boosts: 1 | Views: 925 | Activity: Aug ’24
Use iPad M1 processor as GPU
Hello, I’m currently working on TinyML (ML on the edge) using the Google Colab platform. Because I have exhausted my free compute units, I’m being prompted to pay. I’ve been considering leveraging the GPU capabilities of my iPad M1 and Intel-based Mac. Both devices have Thunderbolt ports capable of sharing connections up to 30GB/s. Since I’m primarily using a classification model, extensive GPU usage isn’t necessary. I’m looking for assistance or guidance on using the iPad’s processor as an eGPU for my Mac, possibly through an API or Apple technology. Any help would be greatly appreciated!
Replies: 2 | Boosts: 0 | Views: 908 | Activity: Sep ’24