My VoIP app has the same issues.
Regarding your step 9, there are other ways for me to increase the volume:
try AVAudioSession.sharedInstance().overrideOutputAudioPort(.none)
try AVAudioSession.sharedInstance().overrideOutputAudioPort(.speaker)
or the user changes the output to the receiver and then switches to the speaker (without using .defaultToSpeaker).
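For reference, here is a minimal sketch of the override toggle as a function, assuming the audio session is already configured with .playAndRecord and active. The name `reapplySpeakerRoute` is hypothetical and only for illustration; this is just how I would wrap the two calls, not a confirmed fix.

```swift
import AVFoundation

// Hypothetical helper (the name is my own): clears any output override,
// then forces the built-in speaker again.
// Assumes the session is already active with the .playAndRecord category.
func reapplySpeakerRoute() {
    let session = AVAudioSession.sharedInstance()
    do {
        // Remove the current override so the route falls back to the default.
        try session.overrideOutputAudioPort(.none)
        // Route the output to the built-in speaker again.
        try session.overrideOutputAudioPort(.speaker)
    } catch {
        // overrideOutputAudioPort(_:) throws if the route change fails.
        print("Failed to override the output port: \(error)")
    }
}
```

In my case the volume only goes up after a call sequence like this (or after the manual receiver-to-speaker switch), which is why I treat it as a workaround rather than a real solution.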
I think when using .playAndRecord, all sounds will be played with a low volume. Additionally, the Voice Processing IO can further decrease or increase the volume. So, there are three different cases when it comes to volume:
Low when using .playAndRecord only
Lowest when using Voice Processing IO
Highest after step 9
I tried using Remote I/O, and my sound always plays at a stable, low volume. I can easily fix that by changing the sound file. However, I ran into a bigger problem: the volume during a conversation is very low, so I have decided to abandon this solution.
To conclude, my main objective is to achieve a stable volume in Voice Processing IO.