Use tf.config.list_physical_devices() to list the available devices. You should see something like "/physical_device:CPU:0" in the list. Shorten that name to "/CPU:0", then run your code inside a block like this: with tf.device("/CPU:0"):. This sped up my training about 20x, even though TF was already using the CPU by default (the TensorBoard profiler logs showed as much). My guess is that explicitly opting in to the CPU only spared TF some work deciding where to place operations.
That in itself sounds like a bug: explicitly stating the device runs much faster than plain default training.
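Here is a minimal sketch of the steps described above, not a definitive recipe. The helper to_device_string and the function train_on_cpu are names I made up for illustration, and the tiny Keras model with random data is just a stand-in for your real training code. I have not measured any speed-up with this toy example.

```python
def to_device_string(physical_name: str) -> str:
    """Turn a physical device name such as '/physical_device:CPU:0'
    into the short form that tf.device() accepts, e.g. '/CPU:0'.
    (Hypothetical helper, not part of TensorFlow.)"""
    return physical_name.replace("physical_device:", "", 1)


def train_on_cpu():
    # Imported here so the helper above works without TensorFlow installed.
    import tensorflow as tf

    # List the CPUs TensorFlow can see and take the first one.
    cpus = tf.config.list_physical_devices("CPU")
    device = to_device_string(cpus[0].name)  # typically "/CPU:0"

    # Build and train inside the explicit device scope.
    with tf.device(device):
        # Placeholder model and data (assumptions, replace with your own).
        model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
        model.compile(optimizer="sgd", loss="mse")
        x = tf.random.normal((64, 4))
        y = tf.random.normal((64, 1))
        model.fit(x, y, epochs=1, verbose=0)
    return device
```

Calling train_on_cpu() runs the toy training pinned to the first CPU. To check whether it makes a difference in your case, time the same training with and without the with tf.device(...) block.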