Post

Replies

Boosts

Views

Activity

Reply to Wrong results when using tensor flow-metal
I've been running into a similar issue with my model training, and I found that it depends on the model that I used (look at the answer I've added in the link). MobileNetV3Small model performs poorly, but a custom image classification model that I define works great on the same dataset. @gtsoukas I wonder if there is any overlap in the layers between your model and mine that could explain this issue.
Dec ’21
Reply to Tensorflow MobileNetV3Small model not training on custom image classification task
The issue seems to be specific to certain types of operations/layers in Tensorflow, and specifically with respect to the validation accuracy (similar to this issue). When I build my own custom model with convolutions like so: model = Sequential([ layers.Rescaling(1./255, input_shape=(IMG_HEIGHT, IMG_WIDTH, 1)), layers.Conv2D(16, 1, padding='same', activation='relu'), layers.MaxPooling2D(), layers.Conv2D(32, 1, padding='same', activation='relu'), layers.MaxPooling2D(), layers.Conv2D(64, 1, padding='same', activation='relu'), layers.MaxPooling2D(), layers.Dropout(0.2), layers.Flatten(), layers.Dense(128, activation='relu'), layers.Dense(num_classes), layers.Softmax() ]) training proceeds as expected, with a high validation accuracy as well. Below is the output for the above model: Found 210454 files belonging to 1098 classes. Metal device set to: Apple M1 Pro systemMemory: 32.00 GB maxCacheSize: 10.67 GB 2021-12-21 12:27:24.005759: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support. 2021-12-21 12:27:24.006206: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>) Found 210454 files belonging to 1098 classes. Using 31568 files for validation. 2021-12-21 12:27:26.965648: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2) 2021-12-21 12:27:26.968717: W tensorflow/core/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz 2021-12-21 12:27:26.969214: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled. 1645/1645 [==============================] - ETA: 0s - loss: 2.1246 - sparse_categorical_accuracy: 0.52732021-12-21 12:32:57.475358: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled. 1645/1645 [==============================] - 353s 214ms/step - loss: 2.1246 - sparse_categorical_accuracy: 0.5273 - val_loss: 1.3041 - val_sparse_categorical_accuracy: 0.6558 2021-12-21 12:33:19.600146: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them. However, the very same code with the MobileNetV3Small model (instead of my custom model) produces the following output: Found 210454 files belonging to 1098 classes. Metal device set to: Apple M1 Pro systemMemory: 32.00 GB maxCacheSize: 10.67 GB 2021-12-21 12:34:46.754598: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support. 2021-12-21 12:34:46.754793: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>) Found 210454 files belonging to 1098 classes. Using 31568 files for validation. 2021-12-21 12:34:49.742015: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2) 2021-12-21 12:34:49.747397: W tensorflow/core/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz 2021-12-21 12:34:49.747606: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled. 1645/1645 [==============================] - ETA: 0s - loss: 2.4072 - sparse_categorical_accuracy: 0.46722021-12-21 12:41:28.137948: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled. 1645/1645 [==============================] - 415s 252ms/step - loss: 2.4072 - sparse_categorical_accuracy: 0.4672 - val_loss: 21.6091 - val_sparse_categorical_accuracy: 0.0131 2021-12-21 12:41:46.017580: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them. /Users/venkat/miniforge3/envs/tf-metal/lib/python3.9/site-packages/keras/utils/generic_utils.py:494: CustomMaskWarning: Custom mask layers require a config and must override get_config. When loading, the custom mask layer must be passed to the custom_objects argument. warnings.warn('Custom mask layers require a config and must override ' The validation loss/accuracy is hilariously bad, and I find that the model constantly predicts the same class. My guess is that MobileNetV3Small seems to contain some operations/layers that don't work well with tensorflow-metal for whatever reason, and only Apple Engineers can fix this problem at a low level.
Dec ’21