I haven't used the GPU implementation for over a year now due to constant issues (I use tf.config.set_visible_devices([], 'GPU') to use the CPU only).
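For anyone wanting to reproduce the CPU-only behaviour, this is the standard TensorFlow call (it must run before any ops are created); a minimal sketch:

import tensorflow as tf

# Hide all GPUs from TensorFlow so everything falls back to the CPU
tf.config.set_visible_devices([], 'GPU')
print(tf.config.list_logical_devices('GPU'))  # expected: []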
I have also had a couple of issues with model convergence when using the GPU, but this issue seems more prominent, and possibly unrelated.
Here is an example of code that causes a memory leak when using the GPU (I cannot link the dataset, but it is called "Text classification documentation", by TANISHQ DUBLISH on Kaggle).
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

# Load the Kaggle dataset and split it 70/15/15 into train/validation/test
df = pd.read_csv('df_file.csv')
df.head()
train_df = df.sample(frac=0.7, random_state=42)
val_df = df.drop(train_df.index).sample(frac=0.5, random_state=42)
test_df = df.drop(train_df.index).drop(val_df.index)

# Build batched, prefetched tf.data pipelines from the raw text and labels
train_dataset = tf.data.Dataset.from_tensor_slices((train_df['Text'].values, train_df['Label'].values)).batch(32).prefetch(tf.data.AUTOTUNE)
val_dataset = tf.data.Dataset.from_tensor_slices((val_df['Text'].values, val_df['Label'].values)).batch(32).prefetch(tf.data.AUTOTUNE)
test_dataset = tf.data.Dataset.from_tensor_slices((test_df['Text'].values, test_df['Label'].values)).batch(32).prefetch(tf.data.AUTOTUNE)

# Text vectorisation and embedding, adapted on the training text only
text_vectorizer = tf.keras.layers.TextVectorization(max_tokens=100_000, output_mode='int', output_sequence_length=1000, pad_to_max_tokens=True)
text_vectorizer.adapt(train_df['Text'].values)
embedding = tf.keras.layers.Embedding(input_dim=len(text_vectorizer.get_vocabulary()), output_dim=128, input_length=1000)

# Simple LSTM classifier over the embedded token sequence (5 classes)
inputs = tf.keras.layers.Input(shape=[], dtype=tf.string)
x = text_vectorizer(inputs)
x = embedding(x)
x = tf.keras.layers.LSTM(64)(x)
outputs = tf.keras.layers.Dense(5, activation='softmax')(x)
model_2 = tf.keras.Model(inputs, outputs, name='model_2_lstm')

model_2.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(), optimizer=tf.keras.optimizers.legacy.Adam(), metrics=['accuracy'])
model_2_history = model_2.fit(train_dataset, epochs=50, validation_data=val_dataset, callbacks=[
    tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True),
    tf.keras.callbacks.ModelCheckpoint(model_2.name, save_best_only=True),
    tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss', patience=5, verbose=1)
])
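If it helps anyone trying to reproduce this, the per-epoch memory growth can be logged with a small Keras callback; this is just a sketch assuming psutil is installed (MemoryLogger is an illustrative name, not part of the code above):

import os
import psutil
import tensorflow as tf

process = psutil.Process(os.getpid())

class MemoryLogger(tf.keras.callbacks.Callback):
    # Print the process's resident memory after each epoch;
    # a leak shows up as steady growth across epochs
    def on_epoch_end(self, epoch, logs=None):
        print(f'epoch {epoch}: rss = {process.memory_info().rss / 1e6:.0f} MB')

# e.g. add MemoryLogger() to the callbacks list passed to model_2.fit above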
I recently wrote some code for a basic GAN (I am learning about neural networks, so not an expert) and got very strange results. Unable to debug it, I tested someone else's code that I know works, and still got the same results. When running a GAN to generate digits from the MNIST dataset, the images produced each epoch are identical to each other and don't resemble digits at all. An example of the images produced can be seen below.
Rerunning the same code on Google Colab, and on my machine locally with standard TensorFlow (i.e. without the metal plugin), gives the expected results of images resembling digits.
The code used to test this can be found here: https://github.com/PacktPublishing/Deep-Learning-with-TensorFlow-2-and-Keras/blob/master/Chapter%206/VanillaGAN.ipynb
I am using these versions of the relevant software: tensorflow-metal 0.5.0, tensorflow-macos 2.9.2, macOS Monterey 12.3.
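For completeness, the TensorFlow version and visible devices can be confirmed with standard calls:

import tensorflow as tf

print(tf.__version__)                          # reports 2.9.2 here
print(tf.config.list_physical_devices('GPU'))  # the Metal GPU should be listed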
I would be grateful if Apple engineers could advise, or give a timeframe for a solution.