Dear All Developers,
It is great that we finally have TF-macOS and TF-Metal for GPU/NPU acceleration. After some tests, everything appears to work well.
So I am wondering whether it is possible to solve NLP tasks with HuggingFace using TF-Metal for GPU acceleration. To find out, I installed all the packages needed and ran the test code.
Here is what I got. So far so good, right?
However, an error pops up when I attempt to fine-tune a BERT model.
Colocation Debug Info:
Colocation group had the following types and supported devices:
Root Member(assigned_device_name_index_=2 requested_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' assigned_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' resource_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' supported_device_types_=[CPU] possible_devices_=[]
RealDiv: GPU CPU
Sqrt: GPU CPU
UnsortedSegmentSum: CPU
AssignVariableOp: GPU CPU
AssignSubVariableOp: GPU CPU
ReadVariableOp: GPU CPU
StridedSlice: GPU CPU
NoOp: GPU CPU
Mul: GPU CPU
Shape: GPU CPU
_Arg: GPU CPU
ResourceScatterAdd: GPU CPU
Unique: CPU
AddV2: GPU CPU
ResourceGather: GPU CPU
Const: GPU CPU
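To make the log above easier to read, here is a minimal sketch that picks out the ops for which no GPU kernel is listed. The op lines are copied from the colocation debug info; the function name `cpu_only_ops` is just a name I made up for illustration, not part of TensorFlow.

```python
# Op lines copied verbatim from the colocation debug info above.
COLOCATION_LINES = """\
RealDiv: GPU CPU
Sqrt: GPU CPU
UnsortedSegmentSum: CPU
AssignVariableOp: GPU CPU
AssignSubVariableOp: GPU CPU
ReadVariableOp: GPU CPU
StridedSlice: GPU CPU
NoOp: GPU CPU
Mul: GPU CPU
Shape: GPU CPU
_Arg: GPU CPU
ResourceScatterAdd: GPU CPU
Unique: CPU
AddV2: GPU CPU
ResourceGather: GPU CPU
Const: GPU CPU
"""


def cpu_only_ops(text):
    """Return the op names whose supported-device list does not include GPU."""
    ops = []
    for line in text.splitlines():
        name, sep, devices = line.partition(":")
        if not sep:
            continue  # skip lines that are not "OpName: DEVICES"
        if "GPU" not in devices.split():
            ops.append(name.strip())
    return ops


print(cpu_only_ops(COLOCATION_LINES))  # ['UnsortedSegmentSum', 'Unique']
```

So, going by the log alone, `UnsortedSegmentSum` and `Unique` are the two ops in this colocation group that only list CPU support, while the group itself is requested on GPU:0.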
It looks like the GPU is not assigned correctly, so I checked whether TensorFlow detects the GPU. Here is the GPU info from TensorFlow.
WARNING:tensorflow:From <ipython-input-2-17bb7203622b>:1: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
WARNING:tensorflow:From <ipython-input-2-17bb7203622b>:1: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
2021-06-29 01:56:25.862829: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2021-06-29 01:56:25.862893: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
Out[2]: True
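The deprecation warning above recommends `tf.config.list_physical_devices('GPU')`, so the same check could be written as the minimal sketch below. The helper name `gpu_devices` is my own, and the import guard is only an assumption I added so that the snippet also runs on a machine without TensorFlow installed.

```python
import importlib.util


def gpu_devices():
    """Return the GPUs TensorFlow can see, or an empty list if TensorFlow is absent."""
    if importlib.util.find_spec("tensorflow") is None:
        return []  # TensorFlow is not installed in this environment
    import tensorflow as tf

    # Non-deprecated replacement for is_gpu_available(), as the warning suggests.
    return tf.config.list_physical_devices("GPU")


devices = gpu_devices()
print("GPU detected:", bool(devices))
for device in devices:
    print(device)
```

A non-empty list here should correspond to the `True` that `is_gpu_available` returned above.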
So the GPU is detected, and the problem seems to come from HuggingFace. I know that Apple is not responsible for packages other than TF-macOS and TF-Metal; I am just curious whether anyone here has a solution.
Sincerely,
hawkiyc