I am having the same issue -- 4/14/2023. On top of that, I still get the warning telling me to use from keras.optimizers import Adam as AdamLegacy to make my binary classifier work. Is there any update I should be aware of?
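For anyone hitting the same optimizer warning, here is a minimal sketch of selecting the legacy Adam. It assumes the namespace is `tf.keras.optimizers.legacy`, which is where I understand the legacy optimizers to live in TensorFlow 2.11 through 2.15; the `make_adam` helper and the commented-out `model` are hypothetical names of mine, not something from this thread.

```python
def make_adam(learning_rate=1e-3):
    """Return legacy Adam if available, else the default Adam, else None."""
    try:
        import tensorflow as tf
    except ImportError:
        return None  # TensorFlow is not installed in this environment
    try:
        # Assumption: the legacy namespace exists and is usable in this release.
        return tf.keras.optimizers.legacy.Adam(learning_rate=learning_rate)
    except (AttributeError, ImportError):
        # Releases without a usable legacy namespace fall back to the default.
        return tf.keras.optimizers.Adam(learning_rate=learning_rate)


optimizer = make_adam()
# Hypothetical usage with a binary classifier called `model`:
# model.compile(optimizer=optimizer, loss="binary_crossentropy", metrics=["AUC"])
```

The fallback is there only so the same script keeps running if the legacy namespace is missing; it does not make the default optimizer any faster.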
Also, I don't see a distribution for tensorflow-metal==0.12.0 (the latest version is 0.8.0). Where can I get it?
I just upgraded to Sonoma, the latest macOS, and my fit method started giving me weird warnings and errors, and the training is very slow! Unusable.
2023-09-28 10:50:29.393958: I metal_plugin/src/device/metal_device.cc:1154] Metal device set to: Apple M1 Pro
2023-09-28 10:50:29.394708: I metal_plugin/src/device/metal_device.cc:296] systemMemory: 16.00 GB
2023-09-28 10:50:29.394896: I metal_plugin/src/device/metal_device.cc:313] maxCacheSize: 5.33 GB
2023-09-28 10:50:29.395661: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:303] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2023-09-28 10:50:29.395694: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:269] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: )
Training a brand new model - [13, 23, 42, 70]-Long
2023-09-28 10:50:35.795666: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:114] Plugin optimizer for device_type GPU is enabled.
2023-09-28 10:50:35.803057: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:114] Plugin optimizer for device_type GPU is enabled.
Epoch 1/10
2023-09-28 10:50:37.043038: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:114] Plugin optimizer for device_type GPU is enabled.
I get the following error after following the instructions from the Apple Metal install website: Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
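One thing that may help when reading these logs: the single letter after the timestamp is the severity, where `I` is INFO, `W` is WARNING, `E` is ERROR and `F` is FATAL. So the NUMA line is logged at INFO level rather than as an error. A small sketch that pulls the severity out of a pasted line; the `classify` helper and the regular expression are mine, written against the lines shown in this thread, not an official parser.

```python
import re

# Matches TensorFlow C++ log lines of the form
# "2023-09-28 10:50:29.395661: I path/to/file.cc:303] message"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): "
    r"(?P<level>[IWEF]) (?P<source>\S+?):(?P<line>\d+)\] (?P<msg>.*)$"
)
SEVERITY = {"I": "INFO", "W": "WARNING", "E": "ERROR", "F": "FATAL"}


def classify(line):
    """Return (severity, message) for a log line, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    return SEVERITY[match.group("level")], match.group("msg")


sample = (
    "2023-09-28 10:50:29.395661: I tensorflow/core/common_runtime/"
    "pluggable_device/pluggable_device_factory.cc:303] Could not identify "
    "NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not "
    "have been built with NUMA support."
)
print(classify(sample))
```

Lines that are not in this format, such as `Epoch 1/10` or the `loc("mps_select"...)` errors, return `None`.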
It seems like there is no consensus as to how to resolve this. I have upgraded my OS to Sonoma, the latest macOS to date, and it seems my TensorFlow needed to be updated along with all dependent libraries, and at that point it runs EXTREMELY slowly. I have been searching all over for a solution, but I wasn't able to find one. Any help or direction from you would be greatly appreciated.
After downgrading to the legacy version, the speed came back and it works again. I then tried to upgrade back to the latest libraries and plugin, and I am getting the following warnings:
2023-09-28 22:02:25.250960: I metal_plugin/src/device/metal_device.cc:1154] Metal device set to: Apple M1 Pro
2023-09-28 22:02:25.250990: I metal_plugin/src/device/metal_device.cc:296] systemMemory: 16.00 GB
2023-09-28 22:02:25.250999: I metal_plugin/src/device/metal_device.cc:313] maxCacheSize: 5.33 GB
2023-09-28 22:02:25.251066: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:303] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2023-09-28 22:02:25.251101: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:269] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: )
Every epoch -- or every fit or predict call -- now displays this line:
2023-09-28 22:05:11.530865: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:114] Plugin optimizer for device_type GPU is enabled.
Any idea what's going on and how to get things back to optimal performance? It is still about 5x slower than the legacy versions.
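On the repeated "Plugin optimizer for device_type GPU is enabled" line: it is logged at INFO level, so raising TensorFlow's C++ log threshold may hide it. A minimal sketch, assuming the variable is set before TensorFlow is first imported; I have not verified that the `metal_plugin` lines honor the same setting, and this only affects the output, not the training speed.

```python
import os

# TF_CPP_MIN_LOG_LEVEL: "0" shows everything, "1" hides INFO,
# "2" also hides WARNING, "3" also hides ERROR.
# It has to be set before the first `import tensorflow`.
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "1"

# import tensorflow as tf  # import only after the variable is set
```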
So, after downgrading to older versions of tensorflow-metal and tensorflow-macos, it works at the desired speed again (although with a few annoying warning messages):
conda install -c apple tensorflow-deps==2.9.0
python -m pip install tensorflow-macos==2.9.0
python -m pip install tensorflow-metal==0.5.0
That did the trick
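To confirm which versions actually ended up in the environment after pinning, something like the following can be run in the same Python interpreter. It uses only the standard library; the `installed_versions` helper is a name of mine, and the package list is taken from the commands above.

```python
from importlib import metadata

PACKAGES = ("tensorflow-macos", "tensorflow-metal", "tensorflow")


def installed_versions(names=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

Packages installed through conda without pip metadata (tensorflow-deps may be one of them) might not show up here, so `conda list` is the better check for those.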
But the annoying warning messages are as follows. If anyone has an idea how to fix them, I am running macOS Sonoma:
819/819 [==============================] - ETA: 0s - loss: 0.7380 - auc: 0.7107 - prc: 0.6348
2023-09-28 22:20:35.837423: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:113] Plugin optimizer for device_type GPU is enabled.
loc("mps_select"("(mpsFileLoc): /AppleInternal/Library/BuildRoots/75428952-3aa4-11ee-8b65-46d450270006/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm":294:0)): error: 'anec.gain_offset_control' op result #0 must be 4D/5D memref of 16-bit float or 8-bit signed integer or 8-bit unsigned integer values, but got 'memref<1x64x1x1xi1>'
loc("mps_select"("(mpsFileLoc): /AppleInternal/Library/BuildRoots/75428952-3aa4-11ee-8b65-46d450270006/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm":294:0)): error: 'anec.gain_offset_control' op result #0 must be 4D/5D memref of 16-bit float or 8-bit signed integer or 8-bit unsigned integer values, but got 'memref<1x64x1x1xi1>'
loc("mps_select"("(mpsFileLoc): /AppleInternal/Library/BuildRoots/75428952-3aa4-11ee-8b65-46d450270006/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm":294:0)): error: 'anec.gain_offset_control' op result #0 must be 4D/5D memref of 16-bit float or 8-bit signed integer or 8-bit unsigned integer values, but got 'memref<1x1x1x1999xi1>'
loc("mps_select"("(mpsFileLoc): /AppleInternal/Library/BuildRoots/75428952-3aa4-11ee-8b65-46d450270006/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm":294:0)): error: 'anec.gain_offset_control' op result #0 must be 4D/5D memref of 16-bit float or 8-bit signed integer or 8-bit unsigned integer values, but got 'memref<1x40x1x1xi1>'
loc("mps_select"("(mpsFileLoc): /AppleInternal/Library/BuildRoots/75428952-3aa4-11ee-8b65-46d450270006/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm":294:0)): error: 'anec.gain_offset_control' op result #0 must be 4D/5D memref of 16-bit float or 8-bit signed integer or 8-bit unsigned integer values, but got 'memref<1x40x1x1xi1>'
It's October 2023 and the issue is still there. After my upgrade to macOS Sonoma I can't get tensorflow-metal to behave well with a batch size of 128. I used to run at 64 just fine (it was speedy), and now with larger batches I do see some (not great) performance improvement, but the model overfits with large batch sizes.
I have read the blogs for all sorts of suggestions, including reverting to an older version of TF for Mac (I don't want to do that).
One suggestion I saw in some postings is to disable the GPU altogether. Has anyone had any success with that?
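For anyone who wants to try the CPU-only route, here is a minimal sketch. It assumes TensorFlow is installed and that this runs at the very start of the script, before any model or tensor is created. `tf.config.set_visible_devices` is the regular TensorFlow call for hiding devices; the `disable_gpu` helper name is mine, and I have not measured whether CPU-only training is actually faster on Sonoma.

```python
def disable_gpu():
    """Hide all GPUs from TensorFlow. Return True if no GPU remains visible."""
    try:
        import tensorflow as tf
    except ImportError:
        return False  # TensorFlow is not installed in this environment
    try:
        # An empty list means "no GPU devices are visible to this process".
        tf.config.set_visible_devices([], "GPU")
    except RuntimeError:
        # Visible devices cannot be changed once the runtime is initialized.
        return False
    return tf.config.get_visible_devices("GPU") == []


gpu_hidden = disable_gpu()
print("GPU hidden:", gpu_hidden)
```

Timing one epoch with and without this call on the same model and batch size should show whether the Metal plugin is helping or hurting in a given setup.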