Dear All,
Mac users waited years for GPU acceleration, and the frameworks were finally released this year. After some testing, I found that NLP tasks were not supported in the previous versions and filed a report as 'FB9220496'. I was then informed that this issue would be fixed in the next versions.
So I updated my frameworks as soon as the latest versions were released. However, I am very disappointed and frustrated because the latest frameworks do not work at all: the IDE kernel crashes whenever I try to train any model. I have attached a copy of my env as a .yaml file, so you should be able to reproduce this issue easily.
Here is the link to the .yaml file:
https://www.icloud.com/iclouddrive/0Yhr444NAqu6oD8qxOfKriY-A#env%5Fmac%5Ftf
Besides that, please note that I only updated TensorFlow-macos and TensorFlow-metal, and this env worked fine, except for NLP tasks, before I updated the frameworks. Could you please look into this issue and solve it? Thank you very much.
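In case it helps with reproduction, here is a minimal sketch for printing the installed versions of the two packages I updated. It assumes the PyPI distribution names `tensorflow-macos` and `tensorflow-metal`; the helper name `installed_versions` is only for illustration.

```python
from importlib import metadata

# Distribution names as published on PyPI (assumed to match the env).
PACKAGES = ("tensorflow-macos", "tensorflow-metal")


def installed_versions(names=PACKAGES):
    """Return {package name: version string, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Running this in the env before and after the update would show exactly which versions changed.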
BTW, the IDE I use is Spyder 5.1.5, though the IDE should not be the reason for the crashes.
I appreciate your time and look forward to your prompt reply.
Sincerely,
Gavin
Dear all Developers,
I updated my Mac to macOS 12 beta 3 a week ago, and I have been trying to back up my Mac over the past few days. However, Time Machine always pops up the error shown in the attachment. I attempted to solve it by following the instructions below, but they did not work.
https://softwaretested.com/mac/time-machine-snapshot-could-not-be-created-for-the-disk-error/#What_is_Disk_in_Time_Machine
The only way to solve it is to back up my Mac in safe mode. However, that is very time-consuming and not a realistic solution.
Sincerely,
hawkiyc
Dear all Developers,
I am trying to reinstall macOS Monterey, but the installation always gets stuck. In some cases it shows "PKDownloadError error 8", while in others my Mac just goes to a black screen and does not respond even after a long wait.
I've looked into this issue, and some websites indicate that PKDownloadError results from the internet connection. However, I can update macOS over my iPhone's hotspot in normal mode without any issue.
I appreciate and look forward to your kind assistance.
Sincerely,
hawkiyc
Dear All Developers,
Since macOS 12 Beta 3, Safari misbehaves when adding a new tab or opening a website in a new tab. The new tab is indeed created but is not shown on top, and you have to click another tab to reveal the one you just created. It isn't a big bug, but it is very annoying. Has anyone else faced this bug too?
For more information, I have uploaded clips of this issue to the following Google Drive links; please find more detail in them.
https://drive.google.com/file/d/1snH74s5H4aLf5ISzfXYmz9IRGfR-u_2G/view?usp=sharing
https://drive.google.com/file/d/1jd_LMVh6ZENrrbZSEZiJahKOuNYi3979/view?usp=sharing
Sincerely,
hawkiyc
Dear All Developers,
I reported an issue with the HuggingFace package in 683992.
At first, I thought the problem came from HuggingFace. However, after some further tests, it seems to result from TensorFlow-Hub.
Here is the thing: I built a fine-tuned BERT model with TF and TF-Hub only, and I got the same error as before.
Here are the details of the error.
InvalidArgumentError: Cannot assign a device for operation AdamWeightDecay/AdamWeightDecay/update/Unique: Could not satisfy explicit device specification '/job:localhost/replica:0/task:0/device:GPU:0' because no supported kernel for GPU devices is available.
Colocation Debug Info:
Colocation group had the following types and supported devices:
Root Member(assigned_device_name_index_=2 requested_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' assigned_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' resource_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' supported_device_types_=[CPU] possible_devices_=[]
RealDiv: GPU CPU
ResourceGather: GPU CPU
AddV2: GPU CPU
Sqrt: GPU CPU
Unique: CPU
ResourceScatterAdd: GPU CPU
UnsortedSegmentSum: CPU
AssignVariableOp: GPU CPU
AssignSubVariableOp: GPU CPU
ReadVariableOp: GPU CPU
NoOp: GPU CPU
Mul: GPU CPU
Shape: GPU CPU
Identity: GPU CPU
StridedSlice: GPU CPU
_Arg: GPU CPU
Const: GPU CPU
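To make the list above easier to read, here is a small sketch that picks out the ops with no GPU kernel from the colocation debug text. It is only a helper for reading the log and assumes each op line has the form `OpName: GPU CPU` or `OpName: CPU`, as in the output above; the function name `cpu_only_ops` is hypothetical.

```python
import re

# One op per line, e.g. "Unique: CPU" or "RealDiv: GPU CPU".
OP_LINE = re.compile(r"\s*(\w+):\s*((?:GPU|CPU)(?:\s+(?:GPU|CPU))*)\s*")


def cpu_only_ops(debug_text):
    """Return the op names whose supported devices do not include GPU."""
    ops = []
    for line in debug_text.splitlines():
        match = OP_LINE.fullmatch(line)
        if match is None:
            continue  # skip headers such as "Colocation Debug Info:"
        op, devices = match.group(1), match.group(2).split()
        if "GPU" not in devices:
            ops.append(op)
    return ops


sample = """\
Colocation Debug Info:
RealDiv: GPU CPU
Unique: CPU
ResourceScatterAdd: GPU CPU
UnsortedSegmentSum: CPU
_Arg: GPU CPU
"""
print(cpu_only_ops(sample))  # ['Unique', 'UnsortedSegmentSum']
```

For the log in this post, the two CPU-only ops are `Unique` and `UnsortedSegmentSum`, which matches the op named in the error message (`AdamWeightDecay/.../update/Unique`).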
So, obviously, there is something wrong with the TF part and I don't think there is a quick solution.
As transformers and related models are so powerful in the NLP area, it would be a great shame if we cannot solve NLP tasks with GPU acceleration.
I will also raise this issue in the Feedback Assistant app; please comment here if you would also like Apple to solve it.
Sincerely,
hawkiyc
Dear All Developers,
It is so great that we finally have TF-macOS and TF-Metal for GPU/NPU acceleration. After some tests, it looks like everything works well.
So, I am wondering whether it is possible to solve NLP tasks with HuggingFace via TF-Metal with GPU acceleration. To figure it out, I installed all the packages needed and ran the test code.
What I got is shown here. So far so good, right?
However, an error pops up when I attempt to fine-tune a BERT model.
Colocation Debug Info:
Colocation group had the following types and supported devices:
Root Member(assigned_device_name_index_=2 requested_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' assigned_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' resource_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' supported_device_types_=[CPU] possible_devices_=[]
RealDiv: GPU CPU
Sqrt: GPU CPU
UnsortedSegmentSum: CPU
AssignVariableOp: GPU CPU
AssignSubVariableOp: GPU CPU
ReadVariableOp: GPU CPU
StridedSlice: GPU CPU
NoOp: GPU CPU
Mul: GPU CPU
Shape: GPU CPU
_Arg: GPU CPU
ResourceScatterAdd: GPU CPU
Unique: CPU
AddV2: GPU CPU
ResourceGather: GPU CPU
Const: GPU CPU
It looks like the GPU is not assigned correctly, so I checked whether the GPU is detected by TensorFlow. Here is the GPU info from TensorFlow.
WARNING:tensorflow:From <ipython-input-2-17bb7203622b>:1: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
WARNING:tensorflow:From <ipython-input-2-17bb7203622b>:1: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
2021-06-29 01:56:25.862829: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2021-06-29 01:56:25.862893: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
Out[2]: True
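As the deprecation warning suggests, the same check can be done with `tf.config.list_physical_devices('GPU')`. Here is a minimal sketch of that; the wrapper name `list_gpus` is hypothetical, and it simply returns an empty list if TensorFlow is not installed in the current env.

```python
def list_gpus():
    """Return the names of the GPUs TensorFlow can see (empty list if none)."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment.
        return []
    # Non-deprecated replacement for tf.test.is_gpu_available().
    return [device.name for device in tf.config.list_physical_devices("GPU")]


print(list_gpus())
```

A non-empty list only means the device is visible to TensorFlow; it does not mean every op has a GPU kernel, which is what the error above is about.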
Obviously, the problem comes from HuggingFace. I know that Apple is not responsible for packages other than TF-macOS and TF-Metal; I am just curious whether anyone here has a solution.
Sincerely,
hawkiyc
Dear All,
I updated my MacBook Pro 16" several days ago. It mostly works fine, but booting is very slow. So I tried the usual fixes for this issue. Here is what I have tried:
Reset SMC
Reset NVRAM/PRAM
Reset user permissions
None of them worked, so I ran Apple Diagnostics, and it said everything is fine on my Mac.
After that, I tried to boot into safe mode to find out which app is slowing down my Mac, and the weirdest thing happened: I can't boot my Mac in safe mode at all. The screen just turns black and keeps showing the loading symbol.
Typically, a computer that can't start up normally still works in safe mode, so we can figure out which app causes the system failure. But that is not the case for me.
I appreciate your kind assistance.
Sincerely,
Dear All,
I am a data scientist and have been waiting for GPU acceleration for years, so I was thrilled when Apple announced that it would come in macOS 12.
So I updated my OS to the Monterey beta and tried to install TensorFlow-Metal a few days ago. However, none of the commands in the installation instructions work.
After that, I looked on pypi.org and found that there are whl files for TensorFlow-macos and TensorFlow-metal. So I tried to pip install both whl files, but again nothing worked.
Here is a screenshot of the installation.
I would very much appreciate it if you could help me solve this issue.
Sincerely,