Normally, when a model is instantiated multiple times, it gets different weights each time (important for statistical significance testing). The current version of TF-Metal (0.4) appears to cache the initial weights, so multiple instantiations end up with identical weights:
from tensorflow.keras import Sequential, layers

def get_model():
    model = Sequential()
    model.add(layers.Dense(5, activation='relu', input_shape=(4, 4)))
    return model

if __name__ == "__main__":
    model1 = get_model()
    model2 = get_model()
    print(model1.weights[0] - model2.weights[0])
Response without TF-Metal (as desired, weights for the two models are different):
tf.Tensor(
[[ 1.0636648 -0.10816181 0.8423695 1.3752697 0.38691664]
[ 0.2402662 0.38139135 -0.19254395 -0.24551326 0.13166189]
[-0.24854952 1.3374841 0.9716329 -0.21249878 -0.34604508]
[ 0.5040202 0.120031 0.13515717 -0.40721053 0.29544616]], shape=(4, 5), dtype=float32)
Response with TF-Metal (weights are all equal):
tf.Tensor(
[[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]], shape=(4, 5), dtype=float32)
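For reference, the expected behavior can be sketched in plain NumPy (a hypothetical stand-in mirroring Keras's default Glorot-uniform Dense initializer, not TF-Metal code): each instantiation is an independent draw, so two kernels should essentially never be equal.

```python
import numpy as np

def glorot_uniform(fan_in, fan_out, rng):
    # Glorot/Xavier uniform: U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out)),
    # which is Keras's default kernel initializer for Dense layers
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

rng = np.random.default_rng()
w1 = glorot_uniform(4, 5, rng)  # same shape as the Dense(5) kernel above
w2 = glorot_uniform(4, 5, rng)  # an independent draw
print(np.allclose(w1, w2))  # prints False: independent draws differ
```

With TF-Metal 0.4 installed, the analogous Keras difference above comes out all-zero instead.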
Any ideas on the root cause or timeline for a fix?
Hi all, I was experimenting with the tf-metal (v0.4) framework and noticed some odd interactions with the tensorflow_probability package:
import tensorflow as tf
import numpy as np
from tensorflow_probability.python.stats import percentile

if __name__ == "__main__":
    data = np.array([0.12941672, 0.22039098, 0.33956015, 0.3787993, 0.5329178,
                     0.62175393, 0.5906472, 0.97234255, 0.7709932, 0.76639813,
                     1.0468946, 1.1515584, 1.0470238, 1.1140094, 1.2083299,
                     1.051311, 1.0782655, 1.0192754, 0.8690998, 0.9439713,
                     0.6992503, 0.7017522, 0.6524739, 0.536425, 0.47863948,
                     0.46657538, 0.45757294, 0.2988146, 0.19273241, 0.1494804,
                     0., 0.], dtype=np.float64)
    data16 = tf.convert_to_tensor(data, dtype=tf.float16)
    data32 = tf.convert_to_tensor(data, dtype=tf.float32)
    data64 = tf.convert_to_tensor(data, dtype=tf.float64)

    p = percentile(data, 99, keepdims=True, interpolation="lower")
    print(f"Percentile based on Numpy array (float64): {p}")
    p = percentile(data16, 99, keepdims=True, interpolation="lower")
    print(f"Percentile based on TF (float16): {p}")
    p = percentile(data32, 99, keepdims=True, interpolation="lower")
    print(f"Percentile based on TF (float32): {p}")
    p = percentile(data64, 99, keepdims=True, interpolation="lower")
    print(f"Percentile based on TF (float64): {p}")
This results in:
Percentile based on Numpy array (float64): [1.1515584]
Percentile based on TF (float16): [1.151]
Percentile based on TF (float32): [-0.]
Percentile based on TF (float64): [1.1515584]
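As a sanity check (not part of the repro), the reference value can be reproduced with NumPy's equivalent of `interpolation="lower"`, which takes the largest data point at or below the percentile rank instead of interpolating. This uses `method="lower"`, the NumPy ≥ 1.22 spelling of the older `interpolation` keyword:

```python
import numpy as np

# Same 32-point dataset as above
data = np.array([0.12941672, 0.22039098, 0.33956015, 0.3787993, 0.5329178,
                 0.62175393, 0.5906472, 0.97234255, 0.7709932, 0.76639813,
                 1.0468946, 1.1515584, 1.0470238, 1.1140094, 1.2083299,
                 1.051311, 1.0782655, 1.0192754, 0.8690998, 0.9439713,
                 0.6992503, 0.7017522, 0.6524739, 0.536425, 0.47863948,
                 0.46657538, 0.45757294, 0.2988146, 0.19273241, 0.1494804,
                 0., 0.], dtype=np.float64)

# "lower": rank (n-1)*q/100 = 31*0.99 = 30.69 is floored to sorted index 30,
# the second-largest value
p_ref = np.percentile(data, 99, method="lower")
print(p_ref)  # 1.1515584
```

Only the TF-Metal float32 path deviates from this value.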
The float32 value here is obviously corrupted, whereas the others are fine (presumably because only float32 is sent to the GPU?). When I uninstall tf-metal, the float32 value is computed correctly. Any thoughts on when a fix might be available? Also, is there any timeline for supporting float16 on the GPU?