I am attempting to build an app that does nearly the same thing. We have reached the point where we can successfully stream H.264 from a Raspberry Pi camera to an iPhone, and the video displays correctly there. The frames are currently represented as CMSampleBuffers, but they need to be converted to CVPixelBuffers.
Between using FFmpeg and feeding the CMSampleBuffers directly, may I ask which route ended up working best for you? And if you can, please share any insight or resources on how you did it.
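In case it helps frame the question, below is a rough sketch of what I have been trying on the CMSampleBuffer route, assuming the buffers coming off the network still contain compressed H.264 (if they were already decoded, `CMSampleBufferGetImageBuffer` should hand back the CVPixelBuffer directly). The class name `H264Decoder`, the `handleDecodedFrame` callback, and the pixel format choice are just placeholders for whatever the real pipeline uses:

```swift
import CoreMedia
import Foundation
import VideoToolbox

/// Rough sketch: decode compressed H.264 CMSampleBuffers into CVPixelBuffers
/// with a VTDecompressionSession. Error handling is mostly omitted for brevity.
final class H264Decoder {
    private var session: VTDecompressionSession?

    /// Build the session from the CMVideoFormatDescription created
    /// out of the stream's SPS/PPS NAL units.
    func makeSession(formatDescription: CMVideoFormatDescription) {
        let attrs = [kCVPixelBufferPixelFormatTypeKey as String:
                     Int(kCVPixelFormatType_420YpCbCr8BiPlanarFullRange)] as CFDictionary
        let status = VTDecompressionSessionCreate(
            allocator: kCFAllocatorDefault,
            formatDescription: formatDescription,
            decoderSpecification: nil,
            imageBufferAttributes: attrs,
            outputCallback: nil,               // using the closure-based decode call below
            decompressionSessionOut: &session)
        assert(status == noErr, "failed to create decompression session: \(status)")
    }

    /// Feed one compressed sample buffer; the closure receives the decoded CVPixelBuffer.
    func decode(_ sampleBuffer: CMSampleBuffer,
                handleDecodedFrame: @escaping (CVPixelBuffer) -> Void) {
        guard let session = session else { return }
        let status = VTDecompressionSessionDecodeFrame(
            session,
            sampleBuffer: sampleBuffer,
            flags: [._EnableAsynchronousDecompression],
            infoFlagsOut: nil) { _, _, imageBuffer, _, _ in
                if let pixelBuffer = imageBuffer {
                    handleDecodedFrame(pixelBuffer)
                }
            }
        if status != noErr {
            print("decode failed with status \(status)")
        }
    }
}
```

The idea would be to call `decode(_:handleDecodedFrame:)` from the same place that currently hands the sample buffers to the display layer, but I am not sure this is the route you ended up taking.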