In a tensorflow-metal virtual environment on macOS 12.1, with these packages installed:
tensorboard 2.6.0
tensorboard-data-server 0.6.1
tensorboard-plugin-profile 2.5.0
tensorboard-plugin-wit 1.8.0
tensorflow 2.6.0
tensorflow-addons 0.14.0
tensorflow-consciousness 0.1
tensorflow-datasets 4.4.0
tensorflow-estimator 2.7.0
tensorflow-gan 2.1.0
tensorflow-hub 0.12.0
tensorflow-io-gcs-filesystem 0.22.0
tensorflow-macos 2.7.0
tensorflow-metadata 1.2.0
tensorflow-metal 0.3.0
tensorflow-probability 0.14.1
tensorflow-similarity 0.13.45
tensorflow-text 2.7.3
Running the Top2Vec model (https://github.com/ddangelov/Top2Vec):
import numpy as np
import pandas as pd
import json
import os
import ipywidgets as widgets
from IPython.display import clear_output, display
from top2vec import Top2Vec
papers_prepared_df = pd.read_feather(
    "/Users/davidlaxer/Downloads/archive/covid19_papers_processed.feather"
)
top2vec_trained = Top2Vec(
    documents=papers_prepared_df.text.tolist(),
    embedding_model="universal-sentence-encoder",
    use_embedding_model_tokenizer=True,
    embedding_model_path="/Users/davidlaxer/Downloads/universal-sentence-encoder_4/",
    workers=4,
)
2021-12-20 06:30:52,188 - top2vec - INFO - Pre-processing documents for training
/Users/davidlaxer/tensorflow-metal/lib/python3.8/site-packages/sklearn/utils/deprecation.py:87: FutureWarning: Function get_feature_names is deprecated; get_feature_names is deprecated in 1.0 and will be removed in 1.2. Please use get_feature_names_out instead.
warnings.warn(msg, category=FutureWarning)
2021-12-20 06:31:57,351 - top2vec - INFO - Loading universal-sentence-encoder model at /Users/davidlaxer/Downloads/universal-sentence-encoder_4
2021-12-20 06:31:57.488459: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-12-20 06:31:57.489288: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2021-12-20 06:31:57.489490: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
Metal device set to: AMD Radeon Pro 5700 XT
2021-12-20 06:31:59.447260: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2021-12-20 06:32:00,841 - top2vec - INFO - Creating joint document/word embedding
2021-12-20 06:32:00.923838: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
Some resource has been exhausted.
For example, this error might be raised if a per-user quota is
exhausted, or perhaps the entire file system is out of space.
@@__init__
2 root error(s) found.
(0) RESOURCE_EXHAUSTED: OOM when allocating tensor with shape[114389,320] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator Simple allocator
[[{{node EncoderDNN/EmbeddingLookup/EmbeddingLookupUnique/GatherV2}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
[[StatefulPartitionedCall/StatefulPartitionedCall/EncoderDNN/EmbeddingLookup/EmbeddingLookupUnique/Reshape_1/_188]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
(1) RESOURCE_EXHAUSTED: OOM when allocating tensor with shape[114389,320] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator Simple allocator
[[{{node EncoderDNN/EmbeddingLookup/EmbeddingLookupUnique/GatherV2}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
...
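For scale, the allocation that fails is not large. TensorFlow's `float` is 32-bit, so a tensor of shape [114389,320] needs 4 bytes per element. A quick check of the arithmetic (the helper name `tensor_nbytes` is only illustrative, not part of any library):

```python
import math


def tensor_nbytes(shape, bytes_per_element=4):
    """Return the bytes needed for a dense tensor of the given shape.

    bytes_per_element defaults to 4, the size of a float32 value.
    """
    return math.prod(shape) * bytes_per_element


# Shape taken from the RESOURCE_EXHAUSTED message in the log.
nbytes = tensor_nbytes((114389, 320))
print(f"{nbytes} bytes = {nbytes / 2**20:.1f} MiB")
# prints "146417920 bytes = 139.6 MiB"
```

That is roughly 140 MiB for this one tensor. The log does not show how much memory was already allocated when the request failed.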
I tried adjusting the batch size (e.g., 500, 100, 50, 10, 5).
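One diagnostic that may help narrow this down is hiding the Metal device so that the encoder runs on the CPU. This is a sketch, not a confirmed fix: it assumes the OOM is specific to the GPU allocator, and the helper name `hide_gpus_from_tensorflow` is made up for illustration. `tf.config.set_visible_devices` has to be called before TensorFlow initializes its devices, so it would need to run before the Top2Vec call.

```python
def hide_gpus_from_tensorflow():
    """Make TensorFlow ignore every GPU so all ops are placed on the CPU.

    Must run before TensorFlow initializes its devices; otherwise
    set_visible_devices raises a RuntimeError.
    """
    # Imported inside the function to keep the sketch self-contained.
    import tensorflow as tf

    # An empty device list for the "GPU" type hides all GPUs,
    # including the Metal PluggableDevice.
    tf.config.set_visible_devices([], "GPU")

    # Return the devices that are still visible so the caller can
    # confirm that no GPU is listed.
    return tf.config.get_visible_devices()


# Usage, before constructing Top2Vec:
#     hide_gpus_from_tensorflow()
```

If the same documents embed without error on the CPU, that would point at the GPU memory allocator rather than at the model or the data.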