Can anyone recommend a really fast API, ideally NEON-optimized, for doing YUV to RGB conversion at runtime on the iPhone using the CPU? The Accelerate framework's vImage doesn't provide anything suitable, sadly, and using vDSP, which means converting to floats and back, seems suboptimal and almost as much work as writing NEON myself.
I know how to use the GPU for this via a shader, and in fact already do so for displaying my main video plane. Unfortunately, I also need to create and save RGBA textures of subregions of the display at runtime. Most of the good answers to this question involve shaders, but I don't want to use the GPU for that additional work, because:
(1) Although I could use RenderTextures and my YUV shader to convert and cache the regions, I don't want to add any more synchronization/complexity to the app. (I already pass textures from a CVTextureCache to Unity3D... I'm switching state from OpenGL behind Unity3D's back in many cases already and don't want to do any more debugging...)
(2) More practically, I am writing a game and don't have any GPU to spare (games generally don't; I've given more presentations in the last few years about how to get things off the GPU than how to put things on it...)
(3) On the iPad, I have a spare core sitting there doing nothing.
Whilst there are many libraries out there that will do YUV to RGBA, I'd love to save the time of writing my own NEON version. Right now I'm using OpenCV's implementation like this:
cv::cvtColor(avFoundationYUVCaptureMat, BGRAInputImage, CV_YUV420sp2BGRA, 4);
which is correct, but end-of-days slow.
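For reference, the per-pixel arithmetic that a NEON version would vectorize can be sketched in scalar fixed-point form. This is only a minimal sketch and not OpenCV's implementation: the function name `yuv420spToBGRA` and the helper `clampToByte` are hypothetical, and it assumes BT.601 video-range coefficients, a tightly packed biplanar 4:2:0 buffer with chroma interleaved as Cb,Cr (swap the two chroma reads if the buffer is Cr,Cb), and even dimensions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: saturate an int to the 0..255 range.
static inline uint8_t clampToByte(int v) {
    return static_cast<uint8_t>(std::min(255, std::max(0, v)));
}

// Hypothetical scalar reference for biplanar YUV 4:2:0 -> BGRA.
// Assumptions (not taken from the post): BT.601 video range, Cb,Cr
// interleaving, no row padding, even width and height.
void yuv420spToBGRA(const uint8_t* yPlane, const uint8_t* cbcrPlane,
                    int width, int height, uint8_t* bgra) {
    assert(width % 2 == 0 && height % 2 == 0);
    for (int row = 0; row < height; ++row) {
        const uint8_t* yRow = yPlane + row * width;
        // Each chroma row is shared by two luma rows and holds width bytes.
        const uint8_t* cRow = cbcrPlane + (row / 2) * width;
        uint8_t* out = bgra + row * width * 4;
        for (int col = 0; col < width; ++col) {
            const int c = yRow[col] - 16;             // luma, offset removed
            const int d = cRow[col & ~1] - 128;       // Cb, shared by 2 pixels
            const int e = cRow[(col & ~1) + 1] - 128; // Cr, shared by 2 pixels
            // 8.8 fixed-point BT.601 coefficients; +128 rounds before the shift.
            // Relies on arithmetic right shift of negative values, which is what
            // mainstream compilers do (and what C++20 guarantees).
            const int r = (298 * c + 409 * e + 128) >> 8;
            const int g = (298 * c - 100 * d - 208 * e + 128) >> 8;
            const int b = (298 * c + 516 * d + 128) >> 8;
            out[4 * col + 0] = clampToByte(b);
            out[4 * col + 1] = clampToByte(g);
            out[4 * col + 2] = clampToByte(r);
            out[4 * col + 3] = 255;                   // opaque alpha
        }
    }
}
```

The multiplies, shift, and saturating narrow are all 16/32-bit integer operations, so the inner loop maps naturally onto 8 or 16 pixels per iteration in SIMD; no float round trip is needed.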
If anyone has previously looked at other implementations (Core Image? FFmpeg?) and can recommend one, I'd be hugely grateful.
Thanks, Alex.