Hi,
I am getting the following error in TensorFlow on an M1 Max when I use a custom loss function (one that I define myself):
2022-02-14 21:23:44.437000: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2022-02-14 21:23:44.437119: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: )
Process Process-82:
Traceback (most recent call last):
  File "/Users/sebtac/miniforge3/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/Users/sebtac/miniforge3/lib/python3.9/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/sebtac/Documents/executor_metal.py", line 892, in executor
    history=model.fit(train_data,
  File "/Users/sebtac/miniforge3/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/Users/sebtac/miniforge3/lib/python3.9/site-packages/tensorflow/python/framework/ops.py", line 7107, in raise_from_not_ok_status
    raise core._status_to_exception(e) from None # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.InvalidArgumentError: Can not squeeze dim[0], expected a dimension of 1, got 512 [Op:Squeeze]
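Custom function:

    from tensorflow.keras import backend as K  # K is assumed to be the Keras backend

    def my_rmse(y_true, y_pred):
        error = y_true - y_pred
        sqr_error = K.square(error)
        mean_sqr_error = K.mean(sqr_error)  # no axis given, so this averages over every axis
        sqrt_mean_sqr_error = K.sqrt(mean_sqr_error)
        return sqrt_mean_sqr_error

    # Fails with the error above:
    model.compile(optimizer=optimizer, loss=my_rmse, run_eagerly=True)
    # Works with a built-in loss:
    #model.compile(optimizer=optimizer, loss="mae", run_eagerly=True)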
Additional details:
- The same error does not occur when I use built-in loss functions.
- 512 is the batch size, and batching works fine without the custom loss function.
- It works when I set the batch size to 1.
- It works on non-M1 Macs.
- I run the model from within a multiprocessing process.
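
For reference, here is a small NumPy sketch (not the TensorFlow code I am actually running) of the one difference I can see between my loss and the built-in ones: my_rmse averages over every axis and returns a single scalar, whereas the built-in Keras losses such as "mae" average over the last axis only and return one value per sample. I do not know whether this is what triggers the Squeeze error on Metal. The target shape of (512, 1) and the helper names rmse_scalar and mae_per_sample are assumptions made up for this illustration.

```python
import numpy as np

BATCH_SIZE = 512  # batch size from my run; the output width of 1 is an assumption


def rmse_scalar(y_true, y_pred):
    """NumPy stand-in for my_rmse: the mean runs over all axes, so the result is a scalar."""
    return np.sqrt(np.mean(np.square(y_true - y_pred)))


def mae_per_sample(y_true, y_pred):
    """NumPy stand-in for the built-in "mae": the mean runs over the last axis only."""
    return np.mean(np.abs(y_true - y_pred), axis=-1)


rng = np.random.default_rng(0)
y_true = rng.normal(size=(BATCH_SIZE, 1))
y_pred = rng.normal(size=(BATCH_SIZE, 1))

scalar_shape = np.shape(rmse_scalar(y_true, y_pred))
per_sample_shape = np.shape(mae_per_sample(y_true, y_pred))
print(scalar_shape)      # ()
print(per_sample_shape)  # (512,)
```

If the output shape turns out to matter, a variant of my_rmse that passes axis=-1 to K.mean would return one value per sample like the built-in losses do, although that is a mean of per-sample RMSEs rather than a single RMSE over the whole batch.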