I ran into a similar problem on an iMac with a 3.6 GHz quad-core Intel Core i7 and a Radeon Pro 560 (4 GB). Adam crashes repeatedly, so I tried tfa.optimizers.RectifiedAdam instead. It works (and uses the GPU!), but it is significantly slower than Adam on the CPU: the HuggingFace glue/mrpc fine-tuning example takes ~15 min per epoch on the CPU, but over an hour per epoch on the GPU.
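
In case anyone wants to try the same swap, here is a rough sketch of what I mean. The helper name make_optimizer and the 3e-5 learning rate are just placeholders I picked for illustration, not something taken from the HuggingFace example. Note that the TensorFlow Addons module is tfa.optimizers (plural). The imports are inside the function so that the crashing Adam path is never touched unless you ask for it.

```python
def make_optimizer(name="radam", learning_rate=3e-5):
    """Return a Keras-compatible optimizer by name ('adam' or 'radam').

    Hypothetical helper: the name and default learning rate are
    illustrative assumptions, not part of any library.
    """
    if name == "adam":
        # Stock Keras Adam (the one that crashes on my GPU setup).
        import tensorflow as tf
        return tf.keras.optimizers.Adam(learning_rate=learning_rate)
    if name == "radam":
        # Rectified Adam from TensorFlow Addons (requires tensorflow-addons).
        import tensorflow_addons as tfa
        return tfa.optimizers.RectifiedAdam(learning_rate=learning_rate)
    raise ValueError(f"unknown optimizer: {name!r}")
```

You would then pass the result to model.compile(optimizer=make_optimizer("radam"), ...) in place of the Adam instance. This only changes which optimizer is used; it does nothing about the speed difference I saw.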