Reply to Missing librairies in /usr/lib on Big Sur?
The problem was caused by version 12.0.0 of ld, which lived in my Anaconda virtual environment. ld 13.1.6 did not have the issue.

    % ld -v
    @(#)PROGRAM:ld PROJECT:ld64-764
    BUILD 11:29:01 May 17 2022
    configured to support archs: armv6 armv7 armv7s arm64 arm64e arm64_32 i386 x86_64 x86_64h armv6m armv7k armv7m armv7em
    LTO support using: LLVM version 13.1.6, (clang-1316.0.21.2.5) (static support for 28, runtime is 28)
    TAPI support using: Apple TAPI version 13.1.6 (tapi-1316.0.7.3)
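Since the culprit was a second ld shadowing the system one, it can help to see which ld is first on PATH and what it reports. The following is a minimal sketch, not a definitive tool: the helper names `parse_ld_version` and `describe_active_ld` are made up for illustration, and the parsing assumes output shaped like the `ld -v` text above.

```python
import re
import shutil
import subprocess


def parse_ld_version(text):
    """Pull the project, LLVM and TAPI versions out of `ld -v` style output."""
    project = re.search(r"PROJECT:(\S+)", text)
    llvm = re.search(r"LLVM version ([\d.]+)", text)
    tapi = re.search(r"TAPI version ([\d.]+)", text)
    return {
        "project": project.group(1) if project else None,
        "llvm": llvm.group(1) if llvm else None,
        "tapi": tapi.group(1) if tapi else None,
    }


def describe_active_ld():
    """Report the first `ld` on PATH and the versions it prints."""
    path = shutil.which("ld")
    if path is None:
        return None
    # Read both streams, since the version banner may go to either one.
    result = subprocess.run([path, "-v"], capture_output=True, text=True)
    info = parse_ld_version(result.stdout + result.stderr)
    info["path"] = path
    return info
```

Calling `describe_active_ld()` inside and outside the activated environment should show whether the environment puts its own ld ahead of the one from the Command Line Tools.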
Jun ’22
Reply to Missing librairies in /usr/lib on Big Sur?
I am getting this error trying to compile AI-Feynman:

    ld: unsupported tapi file type '!tapi-tbd' in YAML file '/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib/libSystem.tbd' for architecture x86_64

I tried to generate a new .tbd file from libSystem.dylib with 'tapi stubify ...', but I can't locate the libSystem.B.dylib file. The other .dylibs in Xcode are not the right ones.

    % locate libSystem.B.dylib
    /Applications/Xcode.app/Contents/Developer/Platforms/AppleTVOS.platform/Library/Developer/CoreSimulator/Profiles/Runtimes/tvOS.simruntime/Contents/Resources/RuntimeRoot/usr/lib/libSystem.B.dylib
    /Applications/Xcode.app/Contents/Developer/Platforms/WatchOS.platform/Library/Developer/CoreSimulator/Profiles/Runtimes/watchOS.simruntime/Contents/Resources/RuntimeRoot/usr/lib/libSystem.B.dylib
    /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Library/Developer/CoreSimulator/Profiles/Runtimes/iOS.simruntime/Contents/Resources/RuntimeRoot/usr/lib/libSystem.B.dylib

Any ideas on how to generate a replacement .tbd file from a 'virtual' shared library which lives in a cache?
    % otool -L /Applications/Xcode.app/Contents/Developer/Platforms/AppleTVOS.platform/Library/Developer/CoreSimulator/Profiles/Runtimes/tvOS.simruntime/Contents/Resources/RuntimeRoot/usr/lib/libSystem.dylib
    /Applications/Xcode.app/Contents/Developer/Platforms/AppleTVOS.platform/Library/Developer/CoreSimulator/Profiles/Runtimes/tvOS.simruntime/Contents/Resources/RuntimeRoot/usr/lib/libSystem.dylib (architecture x86_64):
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1311.100.3)
        /usr/lib/system/libcache.dylib (compatibility version 1.0.0, current version 85.0.0)
        /usr/lib/system/libcommonCrypto.dylib (compatibility version 1.0.0, current version 60191.100.1)
        /usr/lib/system/libcompiler_rt.dylib (compatibility version 1.0.0, current version 103.1.0)
        /usr/lib/system/libcopyfile.dylib (compatibility version 1.0.0, current version 1.0.0)
        /usr/lib/system/libcorecrypto.dylib (compatibility version 1.0.0, current version 1218.100.47)
        /usr/lib/system/libdispatch.dylib (compatibility version 1.0.0, current version 1325.100.36)
        /usr/lib/system/libdyld.dylib (compatibility version 1.0.0, current version 1.0.0)
        /usr/lib/system/libmacho.dylib (compatibility version 1.0.0, current version 994.0.0)
        /usr/lib/system/libremovefile.dylib (compatibility version 1.0.0, current version 60.0.0)
        /usr/lib/system/libsystem_asl.dylib (compatibility version 1.0.0, current version 392.100.2)
        /usr/lib/system/libsystem_blocks.dylib (compatibility version 1.0.0, current version 79.1.0)
        /usr/lib/system/libsystem_c.dylib (compatibility version 1.0.0, current version 1507.100.9)
        /usr/lib/system/libsystem_collections.dylib (compatibility version 1.0.0, current version 1507.100.9)
        /usr/lib/system/libsystem_configuration.dylib (compatibility version 1.0.0, current version 1163.100.19)
        /usr/lib/system/libsystem_containermanager.dylib (compatibility version 1.0.0, current version 1.0.0)
        /usr/lib/system/libsystem_coreservices.dylib (compatibility version 1.0.0, current version 133.0.0)
        /usr/lib/system/libsystem_darwin.dylib (compatibility version 1.0.0, current version 1.0.0)
        /usr/lib/system/libsystem_dnssd.dylib (compatibility version 1.0.0, current version 1557.103.1)
        /usr/lib/system/libsystem_featureflags.dylib (compatibility version 1.0.0, current version 56.0.0)
        /usr/lib/system/libsystem_info.dylib (compatibility version 1.0.0, current version 1.0.0)
        /usr/lib/system/libsystem_m.dylib (compatibility version 1.0.0, current version 3204.80.2)
        /usr/lib/system/libsystem_malloc.dylib (compatibility version 1.0.0, current version 374.100.5)
        /usr/lib/system/libsystem_networkextension.dylib (compatibility version 1.0.0, current version 1.0.0)
        /usr/lib/system/libsystem_notify.dylib (compatibility version 1.0.0, current version 301.0.0)
        /usr/lib/system/libsystem_product_info_filter.dylib (compatibility version 1.0.0, current version 10.0.0)
        /usr/lib/system/libsystem_sandbox.dylib (compatibility version 1.0.0, current version 1657.103.1)
        /usr/lib/system/libsystem_sim_kernel.dylib (compatibility version 1.0.0, current version 238.100.1)
        /usr/lib/system/libsystem_sim_platform.dylib (compatibility version 1.0.0, current version 238.100.1)
        /usr/lib/system/libsystem_sim_pthread.dylib (compatibility version 1.0.0, current version 238.100.1)
        /usr/lib/system/libsystem_trace.dylib (compatibility version 1.0.0, current version 1375.100.9)
        /usr/lib/system/libunwind.dylib (compatibility version 1.0.0, current version 202.2.0)
    ...
    (base) davidlaxer@x86_64-apple-darwin13 iot-inspector-client % ls -l /usr/lib/system
    total 1720
    drwxr-xr-x 4 root wheel 128 May 9 14:30 introspection
    -rwxr-xr-x 1 root wheel 1617536 May 9 14:30 libsystem_kernel.dylib
    -rwxr-xr-x 1 root wheel 512560 May 9 14:30 libsystem_platform.dylib
    -rwxr-xr-x 1 root wheel 656656 May 9 14:30 libsystem_pthread.dylib
    -rwxr-xr-x 1 root wheel 150080 May 9 14:30 wordexp-helper

Any ideas on what the linker doesn't like about file type '!tapi-tbd' in YAML file '/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib/libSystem.tbd' for architecture x86_64?
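The linker message quotes the YAML tag it found in the stub, so one quick check is to read that tag straight from the .tbd file and compare it with what the linker in use accepts. Below is a small sketch of such a check. The function names are hypothetical, and it rests on an assumption: that the format tag sits on the first `---` document marker (for example `!tapi-tbd` in the error above, or a versioned tag such as `!tapi-tbd-v3` in older stubs).

```python
import re


def tbd_document_tag(text):
    """Return the YAML tag on the first document marker of a .tbd stub, or None."""
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comments before the first document marker.
        if not stripped or stripped.startswith("#"):
            continue
        match = re.match(r"---\s*(!\S+)?", stripped)
        if match:
            return match.group(1)
        # First meaningful line is not a document marker: no tag to report.
        return None
    return None


def read_tbd_tag(path):
    """Read only the head of the file; the tag is expected near the top."""
    with open(path, encoding="utf-8") as handle:
        return tbd_document_tag(handle.read(4096))
```

For instance, `read_tbd_tag("/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib/libSystem.tbd")` would return the tag the linker is complaining about, which helps tell a damaged stub apart from a stub that is simply newer than the linker.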
Jun ’22
Reply to ld: unsupported tapi file type '!tapi-tbd' in YAML file '/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib/libSystem.tbd' for architecture x86_64
I tried uninstalling and reinstalling CommandLineTools:

    % ls -ag /Library/Developer
    total 0
    drwxr-xr-x 4 wheel 128 May 31 18:18 .
    drwxr-xr-x 72 wheel 2304 May 17 12:01 ..
    drwxr-xr-x 6 wheel 192 May 31 18:17 CommandLineTools
    drwxr-xr-x 8 admin 256 May 17 01:44 PrivateFrameworks
    % xcrun --show-sdk-platform-path
    xcrun: error: unable to lookup item 'PlatformPath' from command line tools installation
    xcrun: error: unable to lookup item 'PlatformPath' in SDK '/Library/Developer/CommandLineTools/SDKs/MacOSX12.3.sdk'
    (AI-Feynman) davidlaxer@x86_64-apple-darwin13 AI-Feynman % xcode-select -p
    /Library/Developer/CommandLineTools
    (AI-Feynman) davidlaxer@x86_64-apple-darwin13 AI-Feynman % xcrun --show-sdk-path --sdk macosx
    /Library/Developer/CommandLineTools/SDKs/MacOSX12.3.sdk
    (AI-Feynman) davidlaxer@x86_64-apple-darwin13 AI-Feynman % xcrun --sdk macosx10.13 --show-sdk-path
    xcrun: error: SDK "macosx10.13" cannot be located
    xcrun: error: SDK "macosx10.13" cannot be located
    xcrun: error: unable to lookup item 'Path' in SDK 'macosx10.13'

Why the reference to macosx10.13? How do I delete the old SDK reference?
Jun ’22
Reply to Deep Learning Chapter 10: Advanced use of recurrent neural networks not Using GPU
On Google Colab with a GPU each epoch is ~383 seconds (not 32417 seconds): `--2022-03-29 16:58:04-- https://s3.amazonaws.com/keras-datasets/jena_climate_2009_2016.csv.zip Resolving s3.amazonaws.com (s3.amazonaws.com)... 52.217.132.216 Connecting to s3.amazonaws.com (s3.amazonaws.com)|52.217.132.216|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 13565642 (13M) [application/zip] Saving to: ‘jena_climate_2009_2016.csv.zip’ jena_climate_2009_2 100%[===================>] 12.94M 61.8MB/s in 0.2s 2022-03-29 16:58:04 (61.8 MB/s) - ‘jena_climate_2009_2016.csv.zip’ saved [13565642/13565642] Archive: jena_climate_2009_2016.csv.zip inflating: jena_climate_2009_2016.csv inflating: __MACOSX/._jena_climate_2009_2016.csv ['"Date Time"', '"p (mbar)"', '"T (degC)"', '"Tpot (K)"', '"Tdew (degC)"', '"rh (%)"', '"VPmax (mbar)"', '"VPact (mbar)"', '"VPdef (mbar)"', '"sh (g/kg)"', '"H2OC (mmol/mol)"', '"rho (g/m**3)"', '"wv (m/s)"', '"max. wv (m/s)"', '"wd (deg)"'] 420451 num_train_samples: 210225 num_val_samples: 105112 num_test_samples: 105114 WARNING:tensorflow:Layer lstm will not use cuDNN kernels since it doesn't meet the criteria. It will use a generic GPU kernel as fallback when running on GPU. Epoch 1/50 819/819 [==============================] - 383s 461ms/step - loss: 147.6203 - mae: 10.0559 - val_loss: 137.3602 - val_mae: 9.6686 Epoch 2/50 118/819 [===>....`
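The warning in that log says the LSTM layer fell back to a generic GPU kernel because it "doesn't meet the criteria". As a rough aid, here is a sketch that checks layer settings against the requirements listed in the Keras LSTM documentation for the cuDNN kernel. The function `cudnn_blockers` is hypothetical, it covers only the constructor arguments (not the masking/padding or eager-execution requirements), and the post does not show the layer definition, so the `recurrent_dropout=0.25` value below is only an example.

```python
def cudnn_blockers(activation="tanh", recurrent_activation="sigmoid",
                   recurrent_dropout=0.0, unroll=False, use_bias=True):
    """List the LSTM settings that would rule out the fused cuDNN kernel."""
    checks = [
        (activation == "tanh", "activation must be 'tanh'"),
        (recurrent_activation == "sigmoid", "recurrent_activation must be 'sigmoid'"),
        (recurrent_dropout == 0, "recurrent_dropout must be 0"),
        (not unroll, "unroll must be False"),
        (use_bias, "use_bias must be True"),
    ]
    # Keep only the messages whose condition failed.
    return [message for ok, message in checks if not ok]


# Example settings only; the real layer arguments are not shown in the post.
print(cudnn_blockers(recurrent_dropout=0.25))
```

An empty list means none of the checked arguments blocks the cuDNN path; any entry points at a setting that would explain the fallback and the slower epochs.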
Mar ’22
Reply to Some resource has been exhausted. For example, this error might be raised if a per-user quota is exhausted, or perhaps the entire file system is out of space. @@__init__ 2 root error(s) found. (0) RESOURCE_EXHAUSTED: OOM when allocating
The exception is raised while building a list of document vectors from input documents that were not used to train the model, e.g.:

    document_vectors.append(self.embed(train_corpus[current:current + batch_size]))

The Python 3.8 process grows in memory to 100GB and then raises the OOM exception.

    def _embed_documents(self, train_corpus):
        self._check_import_status()
        self._check_model_status()

        # embed documents
        batch_size = 5
        document_vectors = []

        current = 0
        batches = int(len(train_corpus) / batch_size)
        extra = len(train_corpus) % batch_size

        for ind in range(0, batches):
            try:
                # the OOM exception is raised by this call
                document_vectors.append(self.embed(train_corpus[current:current + batch_size]))
            except Exception as e:
                print(e.__doc__)
                print(e)  # Python 3 exceptions have no .message attribute
            current += batch_size

        # embed the final partial batch, if any
        if extra > 0:
            document_vectors.append(self.embed(train_corpus[current:current + extra]))

        document_vectors = self._l2_normalize(np.array(np.vstack(document_vectors)))

        return document_vectors
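The loop above walks the corpus in fixed-size batches and then handles the remainder separately. The same traversal can be written with a small generator, which also makes it easy to record which batch failed. This is only a sketch of the batching pattern, not a fix for the memory growth: `iter_batches` and `embed_in_batches` are hypothetical names, and `embed` stands in for whatever callable returns one vector per document.

```python
def iter_batches(items, batch_size):
    """Yield consecutive slices of `items`; the last slice may be shorter."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_in_batches(corpus, embed, batch_size=5):
    """Call `embed` once per batch, collecting vectors and noting failed batches."""
    vectors, failed = [], []
    for index, batch in enumerate(iter_batches(corpus, batch_size)):
        try:
            vectors.extend(embed(batch))
        except Exception as exc:
            # Record the batch index and error instead of silently continuing.
            failed.append((index, repr(exc)))
    return vectors, failed
```

With a toy stand-in such as `lambda batch: [[float(len(doc))] for doc in batch]`, `embed_in_batches(["a", "bb", "ccc"], ..., batch_size=2)` processes two batches and returns three vectors with an empty `failed` list.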
Dec ’21
Reply to [MPSGraph adamUpdateWithLearningRateTensor:beta1Tensor:beta2Tensor:epsilonTensor:beta1PowerTensor:beta2PowerTensor:valuesTensor:momentumTensor:velocityTensor:gradientTensor:name:]: unrecognized selector sent to instance 0x600000eede10
This code crashes with the 'adam' optimizer. It does work with 'SGD'. I am running Monterey 12.1 beta, and the latest versions of tensorflow-macos and tensorflow-metal from PyPI.

    import tensorflow as tf

    mnist = tf.keras.datasets.mnist

    (x_train, y_train), (x_test, y_test) = mnist.load_data()
    x_train, x_test = x_train / 255.0, x_test / 255.0

    model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10)
    ])

    predictions = model(x_train[:1]).numpy()
    tf.nn.softmax(predictions).numpy()

    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

    loss_fn(y_train[:1], predictions).numpy()

    model.compile(optimizer = 'adam', loss = loss_fn)
    model.fit(x_train, y_train, epochs=100)
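The argument names in the missing selector (learning rate, beta1, beta2, epsilon, the two beta powers, values, momentum, velocity, gradient) line up with the pieces of the textbook Adam update. For reference, here is a scalar sketch of that textbook rule. It is an assumption that the MPSGraph method computes something equivalent; this is not Apple's implementation, and the function name `adam_update` is made up for illustration.

```python
import math


def adam_update(value, gradient, momentum, velocity,
                learning_rate, beta1, beta2, epsilon,
                beta1_power, beta2_power):
    """One textbook Adam step for a single scalar parameter."""
    # Exponential moving averages of the gradient and the squared gradient.
    momentum = beta1 * momentum + (1.0 - beta1) * gradient
    velocity = beta2 * velocity + (1.0 - beta2) * gradient * gradient
    # Bias correction; beta*_power is beta ** t for the current step t.
    m_hat = momentum / (1.0 - beta1_power)
    v_hat = velocity / (1.0 - beta2_power)
    value = value - learning_rate * m_hat / (math.sqrt(v_hat) + epsilon)
    return value, momentum, velocity


# First step (t = 1) from zero-initialised momentum and velocity.
print(adam_update(1.0, 0.5, 0.0, 0.0, 0.001, 0.9, 0.999, 1e-8, 0.9, 0.999))
```

SGD needs none of the momentum, velocity or beta-power state, which is consistent with SGD working here while the Adam path hits the unrecognized selector.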
Nov ’21
Reply to M1 native Python is crashing.
Hi, your example runs for me on Monterey 12.0.1 with Python 3.8 ... if I replace the Adam optimizer with SGD.

    model.compile(
        loss='sparse_categorical_crossentropy',
        optimizer=tf.keras.optimizers.SGD(0.001),
        metrics=['accuracy'],
    )

I've noticed that Adam crashes the session.

    Metal device set to: AMD Radeon Pro 5700 XT
    2021-10-25 12:01:51.733970: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
    To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
    2021-10-25 12:01:51.734526: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
    2021-10-25 12:01:51.734764: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
    2021-10-25 12:01:51.902618: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
    2021-10-25 12:01:51.902647: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
    2021-10-25 12:01:52.021880: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
    2021-10-25 12:01:52.035650: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
    2021-10-25 12:01:52.081019: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
    2021-10-25 12:01:52.099696: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
    2021-10-25 12:01:52.211089: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
    2021-10-25 12:01:52.229341: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
    2021-10-25 12:01:52.237014: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
    2021-10-25 12:01:52.261855: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
    2021-10-25 12:01:52.279544: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
    2021-10-25 12:01:52.304527: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
    2021-10-25 12:01:52.324218: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
    Train on 469 steps, validate on 79 steps
    Epoch 1/12
    469/469 [==============================] - ETA: 0s - batch: 234.0000 - size: 1.0000 - loss: 2.2622 - accuracy: 0.1993
    /Users/davidlaxer/tensorflow-metal/lib/python3.8/site-packages/keras/engine/training.py:2470: UserWarning: `Model.state_updates` will be removed in a future version. This property should not be used in TensorFlow 2.0, as `updates` are applied automatically.
      warnings.warn('`Model.state_updates` will be removed in a future version. '
    2021-10-25 12:02:06.665054: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
    469/469 [==============================] - 15s 19ms/step - batch: 234.0000 - size: 1.0000 - loss: 2.2622 - accuracy: 0.1993 - val_loss: 2.2074 - val_accuracy: 0.4665
    Epoch 2/12
    469/469 [==============================] - 11s 20ms/step - batch: 234.0000 - size: 1.0000 - loss: 2.1208 - accuracy: 0.3812 - val_loss: 1.9072 - val_accuracy: 0.6792
    Epoch 3/12
    469/469 [==============================] - 11s 21ms/step - batch: 234.0000 - size: 1.0000 - loss: 1.6169 - accuracy: 0.5601 - val_loss: 1.0289 - val_accuracy: 0.8151
    Epoch 4/12
    469/469 [==============================] - 12s 22ms/step - batch: 234.0000 - size: 1.0000 - loss: 1.0248 - accuracy: 0.6935 - val_loss: 0.5984 - val_accuracy: 0.8613
    Epoch 5/12
    469/469 [==============================] - 12s 23ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.7831 - accuracy: 0.7570 - val_loss: 0.4718 - val_accuracy: 0.8799
    Epoch 6/12
    469/469 [==============================] - 13s 24ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.6629 - accuracy: 0.7937 - val_loss: 0.4055 - val_accuracy: 0.8929
    Epoch 7/12
    469/469 [==============================] - 13s 24ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.6024 - accuracy: 0.8123 - val_loss: 0.3660 - val_accuracy: 0.9007
    Epoch 8/12
    469/469 [==============================] - 13s 25ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.5541 - accuracy: 0.8301 - val_loss: 0.3380 - val_accuracy: 0.9073
    Epoch 9/12
    469/469 [==============================] - 13s 25ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.5244 - accuracy: 0.8397 - val_loss: 0.3181 - val_accuracy: 0.9121
    Epoch 10/12
    469/469 [==============================] - 13s 24ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.4910 - accuracy: 0.8500 - val_loss: 0.2988 - val_accuracy: 0.9161
    Epoch 11/12
    469/469 [==============================] - 13s 25ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.4683 - accuracy: 0.8570 - val_loss: 0.2857 - val_accuracy: 0.9186
    Epoch 12/12
    469/469 [==============================] - 14s 25ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.4562 - accuracy: 0.8600 - val_loss: 0.2736 - val_accuracy: 0.9207
    [1]: <keras.callbacks.History at 0x7f8758fd9310>
Oct ’21
Reply to tensorflow-macos slow (Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.)
On my iMac 27" with Monterey 12.0.1 it crashes with the GPU in tensorflow-metal:

    % python muzero.py
    2021-10-21 08:36:21.088556: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
    To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
    Metal device set to: AMD Radeon Pro 5700 XT
    systemMemory: 128.00 GB
    maxCacheSize: 7.99 GB
    2021-10-21 08:36:21.089347: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
    2021-10-21 08:36:21.089966: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
    2021-10-21 08:36:21.753689: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
    2021-10-21 08:36:21.759239: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
    2021-10-21 08:36:34.888 python[14296:730686] -[MPSGraph adamUpdateWithLearningRateTensor:beta1Tensor:beta2Tensor:epsilonTensor:beta1PowerTensor:beta2PowerTensor:valuesTensor:momentumTensor:velocityTensor:gradientTensor:name:]: unrecognized selector sent to instance 0x600001b26220
    zsh: segmentation fault python muzero.py

It runs with the CPU.
% python --version Python 3.8.5 % pip freeze absl-py==0.12.0 anyio==3.3.2 appnope==0.1.2 argon2-cffi==21.1.0 asttokens==2.0.5 astunparse==1.6.3 attrs==21.2.0 Babel==2.9.1 backcall==0.2.0 bleach==4.1.0 bokeh==2.3.3 cachetools==4.2.4 certifi==2021.5.30 cffi==1.14.6 charset-normalizer==2.0.6 clang==5.0 cloudpickle==2.0.0 colorama==0.4.4 cycler==0.10.0 Cython==0.29.24 debugpy==1.5.0 decorator==5.1.0 defusedxml==0.7.1 dill==0.3.4 distinctipy==1.1.5 dm-tree==0.1.6 dotmap==1.3.24 entrypoints==0.3 executing==0.8.2 flatbuffers==1.12 future==0.18.2 gast==0.4.0 gensim==3.8.3 google-auth==1.35.0 google-auth-oauthlib==0.4.6 google-pasta==0.2.0 googleapis-common-protos==1.53.0 grpcio==1.41.0 gviz-api==1.9.0 gym==0.21.0 h5py==3.1.0 hdbscan==0.8.27 icecream==2.1.1 idna==3.2 importlib-resources==5.2.2 ipykernel==6.4.1 ipython==7.28.0 ipython-genutils==0.2.0 ipywidgets==7.6.5 jedi==0.18.0 Jinja2==3.0.2 joblib==1.1.0 json5==0.9.6 jsonschema==4.0.1 jupyter-client==7.0.6 jupyter-core==4.8.1 jupyter-server==1.11.1 jupyterlab==3.1.18 jupyterlab-pygments==0.1.2 jupyterlab-server==2.8.2 jupyterlab-widgets==1.0.2 keras==2.6.0 Keras-Preprocessing==1.1.2 kiwisolver==1.3.2 llvmlite==0.37.0 Markdown==3.3.4 MarkupSafe==2.0.1 matplotlib==3.4.3 matplotlib-inline==0.1.3 memory-profiler==0.58.0 mistune==0.8.4 nbclassic==0.3.2 nbclient==0.5.4 nbconvert==6.2.0 nbformat==5.1.3 nest-asyncio==1.5.1 nmslib==2.1.1 notebook==6.4.4 numba==0.54.0 numpy==1.20.3 oauthlib==3.1.1 opt-einsum==3.3.0 packaging==21.0 pandas==1.3.3 pandocfilters==1.5.0 parso==0.8.2 pexpect==4.8.0 pickleshare==0.7.5 Pillow==8.3.2 prometheus-client==0.11.0 promise==2.3 prompt-toolkit==3.0.20 protobuf==3.18.1 psutil==5.8.0 ptyprocess==0.7.0 pyasn1==0.4.8 pyasn1-modules==0.2.8 pybind11==2.6.1 pycparser==2.20 Pygments==2.10.0 pynndescent==0.5.4 pyparsing==2.4.7 pyrsistent==0.18.0 python-dateutil==2.8.2 pytz==2021.3 PyYAML==5.4.1 pyzmq==22.3.0 requests==2.26.0 requests-oauthlib==1.3.0 requests-unixsocket==0.2.0 rsa==4.7.2 scikit-learn==1.0 
scipy==1.7.1 Send2Trash==1.8.0 six==1.15.0 smart-open==5.2.1 sniffio==1.2.0 tabulate==0.8.9 tensorboard==2.6.0 tensorboard-data-server==0.6.1 tensorboard-plugin-profile==2.5.0 tensorboard-plugin-wit==1.8.0 tensorflow==2.6.0 tensorflow-consciousness==0.1 tensorflow-datasets==4.4.0 tensorflow-estimator==2.6.0 tensorflow-gan==2.1.0 tensorflow-hub==0.12.0 tensorflow-macos==2.6.0 tensorflow-metadata==1.2.0 tensorflow-metal==0.2.0 tensorflow-probability==0.14.1 tensorflow-similarity==0.13.45 tensorflow-text==2.6.0 termcolor==1.1.0 terminado==0.12.1 testpath==0.5.0 threadpoolctl==3.0.0 top2vec==1.0.26 tornado==6.1 tqdm==4.62.3 traitlets==5.1.0 typing-extensions==3.7.4.3 umap-learn==0.5.1 urllib3==1.26.7 wcwidth==0.2.5 webencodings==0.5.1 websocket-client==1.2.1 Werkzeug==2.0.2 widgetsnbextension==3.5.1 wordcloud==1.8.1 wrapt==1.12.1 zipp==3.6.0
Oct ’21
Reply to [MPSGraph adamUpdateWithLearningRateTensor:beta1Tensor:beta2Tensor:epsilonTensor:beta1PowerTensor:beta2PowerTensor:valuesTensor:momentumTensor:velocityTensor:gradientTensor:name:]: unrecognized selector sent to instance 0x600000eede10
Virtual Environment % pip list Package Version ------------------------ --------- absl-py 0.12.0 anyio 3.3.2 appnope 0.1.2 argon2-cffi 21.1.0 astunparse 1.6.3 attrs 21.2.0 Babel 2.9.1 backcall 0.2.0 bleach 4.1.0 bokeh 2.3.3 cachetools 4.2.4 certifi 2021.5.30 cffi 1.14.6 charset-normalizer 2.0.6 clang 5.0 cloudpickle 2.0.0 cycler 0.10.0 Cython 0.29.24 debugpy 1.5.0 decorator 5.1.0 defusedxml 0.7.1 dill 0.3.4 distinctipy 1.1.5 dm-tree 0.1.6 dotmap 1.3.24 entrypoints 0.3 flatbuffers 1.12 future 0.18.2 gast 0.4.0 gensim 3.8.3 google-auth 1.35.0 google-auth-oauthlib 0.4.6 google-pasta 0.2.0 googleapis-common-protos 1.53.0 grpcio 1.41.0 h5py 3.1.0 hdbscan 0.8.27 idna 3.2 importlib-resources 5.2.2 ipykernel 6.4.1 ipython 7.28.0 ipython-genutils 0.2.0 ipywidgets 7.6.5 jedi 0.18.0 Jinja2 3.0.2 joblib 1.1.0 json5 0.9.6 jsonschema 4.0.1 jupyter-client 7.0.6 jupyter-core 4.8.1 jupyter-server 1.11.1 jupyterlab 3.1.18 jupyterlab-pygments 0.1.2 jupyterlab-server 2.8.2 jupyterlab-widgets 1.0.2 keras 2.6.0 Keras-Preprocessing 1.1.2 kiwisolver 1.3.2 llvmlite 0.37.0 Markdown 3.3.4 MarkupSafe 2.0.1 matplotlib 3.4.3 matplotlib-inline 0.1.3 memory-profiler 0.58.0 mistune 0.8.4 nbclassic 0.3.2 nbclient 0.5.4 nbconvert 6.2.0 nbformat 5.1.3 nest-asyncio 1.5.1 nmslib 2.1.1 notebook 6.4.4 numba 0.54.0 numpy 1.20.3 oauthlib 3.1.1 opt-einsum 3.3.0 packaging 21.0 pandas 1.3.3 pandocfilters 1.5.0 parso 0.8.2 pexpect 4.8.0 pickleshare 0.7.5 Pillow 8.3.2 pip 21.2.4 prometheus-client 0.11.0 promise 2.3 prompt-toolkit 3.0.20 protobuf 3.18.1 psutil 5.8.0 ptyprocess 0.7.0 pyasn1 0.4.8 pyasn1-modules 0.2.8 pybind11 2.6.1 pycparser 2.20 Pygments 2.10.0 pynndescent 0.5.4 pyparsing 2.4.7 pyrsistent 0.18.0 python-dateutil 2.8.2 pytz 2021.3 PyYAML 5.4.1 pyzmq 22.3.0 requests 2.26.0 requests-oauthlib 1.3.0 requests-unixsocket 0.2.0 rsa 4.7.2 scikit-learn 1.0 scipy 1.7.1 Send2Trash 1.8.0 setuptools 47.1.0 six 1.15.0 smart-open 5.2.1 sniffio 1.2.0 tabulate 0.8.9 tensorboard 2.6.0 tensorboard-data-server 0.6.1 
tensorboard-plugin-wit 1.8.0 tensorflow 2.6.0 tensorflow-consciousness 0.1 tensorflow-datasets 4.4.0 tensorflow-estimator 2.6.0 tensorflow-gan 2.1.0 tensorflow-hub 0.12.0 tensorflow-macos 2.6.0 tensorflow-metadata 1.2.0 tensorflow-metal 0.2.0 tensorflow-probability 0.14.1 tensorflow-similarity 0.13.45 tensorflow-text 2.6.0 termcolor 1.1.0 terminado 0.12.1 testpath 0.5.0 threadpoolctl 3.0.0 top2vec 1.0.26 tornado 6.1 tqdm 4.62.3 traitlets 5.1.0 typing-extensions 3.7.4.3 umap-learn 0.5.1 urllib3 1.26.7 wcwidth 0.2.5 webencodings 0.5.1 websocket-client 1.2.1 Werkzeug 2.0.2 wheel 0.37.0 widgetsnbextension 3.5.1 wordcloud 1.8.1 wrapt 1.12.1 zipp 3.6.0 (tensorflow-metal) (base) davidlaxer@x86_64-apple-darwin13 ~ %
Oct ’21
Reply to [MPSGraph adamUpdateWithLearningRateTensor:beta1Tensor:beta2Tensor:epsilonTensor:beta1PowerTensor:beta2PowerTensor:valuesTensor:momentumTensor:velocityTensor:gradientTensor:name:]: unrecognized selector sent to instance 0x600000eede10
This code reproduces the crash: test.txt

Also, running without Metal (CPU only) is much faster with the 'SGD' optimizer: about 4X less user CPU time (721.48s vs. 181.09s) and 1:32.86 instead of 10:31.28 of wall-clock time for 100 epochs. I can't compare the Adam optimizer since it crashed.

    In [2]: import tensorflow as tf
       ...:
       ...: mnist = tf.keras.datasets.mnist
       ...:
       ...: (x_train, y_train), (x_test, y_test) = mnist.load_data()
       ...: x_train, x_test = x_train / 255.0, x_test / 255.0
       ...:
       ...: model = tf.keras.models.Sequential([
       ...:   tf.keras.layers.Flatten(input_shape=(28, 28)),
       ...:   tf.keras.layers.Dense(128, activation='relu'),
       ...:   tf.keras.layers.Dropout(0.2),
       ...:   tf.keras.layers.Dense(10)
       ...: ])
       ...:
       ...: predictions = model(x_train[:1]).numpy()
       ...: tf.nn.softmax(predictions).numpy()
       ...:
       ...: loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
       ...:
       ...: loss_fn(y_train[:1], predictions).numpy()
       ...:
       ...: model.compile(optimizer = 'adam', loss = loss_fn)
       ...: model.fit(x_train, y_train, epochs=100)
    Epoch 1/100
    2021-10-10 10:50:53.503460: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
    2021-10-10 10:50:53.527 python[25080:3485800] -[MPSGraph adamUpdateWithLearningRateTensor:beta1Tensor:beta2Tensor:epsilonTensor:beta1PowerTensor:beta2PowerTensor:valuesTensor:momentumTensor:velocityTensor:gradientTensor:name:]: unrecognized selector sent to instance 0x6000037975a0
    zsh: segmentation fault ipython

tensorflow_metal (GPU):

    % time python test.py
    2021-10-10 11:34:34.602604: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
    To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
    Metal device set to: AMD Radeon Pro 5700 XT
    systemMemory: 128.00 GB
    maxCacheSize: 7.99 GB
    2021-10-10 11:34:34.603850: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
    2021-10-10 11:34:34.604642: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
    2021-10-10 11:34:35.779610: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
    Epoch 1/100
    2021-10-10 11:34:35.929611: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
    1875/1875 [==============================] - 7s 3ms/step - loss: 0.7213
    Epoch 2/100
    1875/1875 [==============================] - 6s 3ms/step - loss: 0.3865
    ... 3ms/step - loss: 0.0474
    ...
    Epoch 100/100
    1875/1875 [==============================] - 6s 3ms/step - loss: 0.0473
    python test.py 721.48s user 375.56s system 173% cpu 10:31.28 total
    (tensorflow-metal) (base) davidlaxer@x86_64-apple-darwin13 ~ %

tensorflow (CPU):

    % time python ~/test.py
    2021-10-10 11:45:44.111971: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
    To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
    2021-10-10 11:45:44.487763: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
    Epoch 1/100
    1875/1875 [==============================] - 1s 460us/step - loss: 0.7210
    Epoch 2/100
    1875/1875 [==============================] - 1s 459us/step - loss: 0.3874
    Epoch 3/100
    1875/1875 [==============================] - 1s 459us/step - loss: 0.3233
    Epoch 4/100
    1875/1875 [==============================] - 1s 460us/step - loss: 0.2884
    Epoch 5/100
    1875/1875 [==============================] - 1s 471us/step - loss: 0.2608
    Epoch 6/100
    1875/1875 [==============================] - 1s 462us/step - loss: 0.2400
    Epoch 7/100
    ...
    Epoch 99/100
    1875/1875 [==============================] - 1s 468us/step - loss: 0.0455
    Epoch 100/100
    1875/1875 [==============================] - 1s 469us/step - loss: 0.0463
    python ~/test.py 181.09s user 48.20s system 246% cpu 1:32.86 total
    (ai) davidlaxer@x86_64-apple-darwin13 text %
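To put numbers on the GPU versus CPU comparison, the two zsh `time` summary lines from the runs above can be parsed and divided. This is a small sketch; `parse_zsh_time` is a hypothetical helper and it assumes the summary format shown in those two lines.

```python
import re


def parse_zsh_time(line):
    """Parse a zsh `time` summary line into user, system and total seconds."""
    match = re.search(
        r"([\d.]+)s user ([\d.]+)s system \d+% cpu ([\d:.]+) total", line
    )
    if match is None:
        raise ValueError(f"not a zsh time summary: {line!r}")
    # The total is printed as [[hours:]minutes:]seconds.
    total = 0.0
    for part in match.group(3).split(":"):
        total = total * 60.0 + float(part)
    return {
        "user": float(match.group(1)),
        "system": float(match.group(2)),
        "total": total,
    }


gpu = parse_zsh_time("python test.py 721.48s user 375.56s system 173% cpu 10:31.28 total")
cpu = parse_zsh_time("python ~/test.py 181.09s user 48.20s system 246% cpu 1:32.86 total")
print(f"wall-clock ratio: {gpu['total'] / cpu['total']:.2f}")
print(f"user CPU ratio:   {gpu['user'] / cpu['user']:.2f}")
```

On these two lines the user CPU ratio comes out just under 4, while the wall-clock ratio is closer to 6.8, which matches the per-step times in the logs (about 3ms on the GPU against about 460us on the CPU).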
Oct ’21