How do you synchronize an MTLTexture and IOSurface across processes?
What APIs do I need to use, and what precautions do I need to take, when writing from an XPC service to an IOSurface that is also being used as the backing store for an MTLTexture in the main application?

In my XPC service I have the following:

IOSurface *surface = ...; // surface created earlier in the service
CIRenderDestination *renderDestination =
    [[CIRenderDestination alloc] initWithIOSurface:surface];

// Send the IOSurface to the client using an NSXPCConnection.
// In the service, periodically write to the IOSurface.
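For reference, the render call on the service side looks something like this (simplified; context and image stand in for my actual CIContext and the CIImage being rendered):

NSError *error = nil;
CIRenderTask *task = [context startTaskToRender:image
                                  toDestination:renderDestination
                                          error:&error];
// Block until Core Image has finished writing into the IOSurface
// before telling the app the surface has new content.
[task waitUntilCompletedAndReturnError:&error];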

In my application I have the following:

IOSurface *surface = ...; // fetched from the NSXPCConnection
id<MTLTexture> texture = [device newTextureWithDescriptor:...
                                                iosurface:(__bridge IOSurfaceRef)surface
                                                    plane:0];

// The texture is used in a fragment shader (Read-only)

I have an MTKView that is running its normal update loop. I want my XPC service to periodically write to the IOSurface using Core Image and then have the new contents rendered by Metal on the app side.

What synchronization is needed to ensure this is done properly? A double- or triple-buffering strategy is one option, but that doesn't really work for me because I might not have enough memory to allocate two or three times the number of surfaces. (The example above uses one surface for clarity, but in reality I might have dozens of surfaces I'm drawing to. Each surface represents a tile of an image, and an image can be as large as the JPG/TIFF/etc. format allows.)

WWDC 2010 session 442 talks about IOSurface and briefly mentions that it all "just works", but that's in the context of OpenGL and doesn't mention Core Image or Metal.

I originally assumed that Core Image and/or Metal would be calling IOSurfaceLock() and IOSurfaceUnlock() to protect read/write access, but that doesn't appear to be the case at all. (And the comments in the IOSurfaceRef.h header suggest that the locking is only for CPU access.)
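(For completeness, the CPU-side locking that header describes would look like the following; I'm not doing this, since everything here stays on the GPU:)

// CPU-side access only: bracket reads with a read-only lock so any
// pending GPU writes are synced back to memory first.
IOSurfaceRef ref = (__bridge IOSurfaceRef)surface;
IOSurfaceLock(ref, kIOSurfaceLockReadOnly, NULL);
const void *pixels = IOSurfaceGetBaseAddress(ref);
// ... read pixel data ...
IOSurfaceUnlock(ref, kIOSurfaceLockReadOnly, NULL);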

Can I really just let Core Image's CIRenderDestination write at will to the IOSurface while I read from the corresponding MTLTexture in my application's update loop? If so, then how is that possible if, as the WWDC video states, all textures bound to an IOSurface share the same video memory? Surely I'd get some tearing of the surface's contents if reading and writing occurred during the same pass.

Kelly answered 3/2, 2019 at 15:46

The thing you need to do is ensure that the Core Image drawing has completed in the XPC service before the IOSurface is used for drawing in the application. If you were using OpenGL or Metal on both sides, you would call glFlush() or -[MTLCommandBuffer waitUntilScheduled], respectively. I would assume that something in Core Image is making one of those calls.
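On the Metal side, that would look roughly like this (a sketch, with commandBuffer being whatever buffer encoded the write):

[commandBuffer commit];
[commandBuffer waitUntilScheduled];   // or -waitUntilCompleted for a full sync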

It will likely be obvious if that isn't happening: you'll get tearing, or images that are half new rendering and half old rendering. I've seen that happen when using IOSurfaces across XPC services.

One thing you can do is put symbolic breakpoints on -waitUntilScheduled and -waitUntilCompleted and see whether CI calls them in your XPC service (assuming the documentation doesn't explicitly tell you). There are other synchronization primitives in Metal that may also be useful, though I'm not very familiar with them. (It's my understanding that CI is all Metal under the hood now.)
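For example, MTLSharedEvent (macOS 10.14+) can signal across process boundaries. A rough sketch with illustrative names (I haven't verified that CI uses this, and it assumes you control the command buffers on both sides):

// --- Service (writer) side ---
id<MTLSharedEvent> sharedEvent = [device newSharedEvent];
MTLSharedEventHandle *handle = [sharedEvent newSharedEventHandle];
// Send 'handle' to the app over the NSXPCConnection (it conforms
// to NSSecureCoding), then signal after each write:
[writeCommandBuffer encodeSignalEvent:sharedEvent value:frameIndex];
[writeCommandBuffer commit];

// --- Application (reader) side ---
id<MTLSharedEvent> remoteEvent = [device newSharedEventWithHandle:handle];
[readCommandBuffer encodeWaitForEvent:remoteEvent value:frameIndex];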

Also, the IOSurface class has -incrementUseCount and -decrementUseCount methods and a localUseCount property. It might be worth checking those to see if CI sets them appropriately. (See <IOSurface/IOSurfaceObjC.h> for details.)
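For example (a sketch, purely for inspection; I don't know whether anything enforces these counts):

// Writer side: mark the surface in use while rendering into it.
[surface incrementUseCount];
// ... write to the surface ...
[surface decrementUseCount];

// Either side: inspect the counts for debugging.
NSLog(@"localUseCount=%d inUse=%d", surface.localUseCount, surface.inUse);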

Sivie answered 3/2, 2019 at 19:32
-[CIRenderTask waitUntilCompletedAndReturnError:] will ensure that Core Image is finished, but using it requires that I somehow block the Metal render loop in the application when I initiate the Core Image rendering and then unblock the render loop when I'm done. Apple's sample code uses two IOSurfaces to achieve this and requires that the service constantly tell the app which surface it can read from. I'd like to avoid double or triple buffering to reduce memory overhead. MTLSharedEventHandle looks promising, but it's only for 10.14+ and documentation on it is very thin...Kelly
IOSurface useCount doesn't appear to have any bearing on what you can do with a surface, at least from what I can tell. Core Image and Metal appear to be able to read and write to a surface regardless of the value of useCount. The only thing I've found that depends on the useCount is a CVPixelBufferPoolRef that uses the value of the useCount to know when it can safely recycle an IOSurface backed pixel buffer.Kelly
It sounds like what you really need is better memory management. Why do you need to keep such large images in memory? (Also, is this macOS or iOS?) Do you need to keep the entire image in one contiguous block or could these large images be tiled so you can do some reasonable caching of the tiles?Sivie
On macOS. The XPC service has a single CIImage that could represent a very large image. IOSurface and MTLTexture are both limited to 16,384 pixels per edge, so I tile the image by creating multiple surfaces and then use Core Image's ability to render a specific region into a tile's surface (sketched below). There are enough surfaces loaded into memory to represent the image at full resolution. When I need to draw one or more of those surfaces in the app's MTKView, I wrap them in a Metal texture and blit to a drawable. This has proven to be way faster than using Core Image to draw the tile on the fly.Kelly
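(For reference, the per-tile render described above looks roughly like this, with tileRect being a hypothetical tile's region of the full image:)

NSError *error = nil;
CIRenderDestination *tileDestination =
    [[CIRenderDestination alloc] initWithIOSurface:tileSurface];
// Render only this tile's region of the large image into its surface.
[context startTaskToRender:fullImage
                  fromRect:tileRect
             toDestination:tileDestination
                   atPoint:CGPointZero
                     error:&error];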
Have you considered mip-mapping them? If you're not drawing the entire image, you can have just what you are drawing in VRAM (maybe with some extra to anticipate transformations). If you are drawing the entire image, you don't need the full resolution since your user's monitor isn't large enough to hold it. You could use a higher mipmap level and not even include the base image for significant memory savings.Sivie
I use mipmap biasing in the fragment shader depending on which level of detail is the most up to date. The CIImage (with filters) is first rendered at a lower resolution and then rendered at full resolution on the second pass. The XPC service has a tile set for each "mipmap" level it can render at. In the application, I just blit that surface's contents into the texture's appropriate mipmap level. For optimum panning and zooming of the canvas, I want as much of the image as possible ready to go, depending on system memory, hence the wish to have so many surfaces allocated.Kelly
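(The blit into a mipmap level looks roughly like this sketch; tileTexture, canvasTexture, mipLevel, and the destination origin are illustrative names:)

id<MTLBlitCommandEncoder> blit = [commandBuffer blitCommandEncoder];
[blit copyFromTexture:tileTexture
          sourceSlice:0
          sourceLevel:0
         sourceOrigin:MTLOriginMake(0, 0, 0)
           sourceSize:MTLSizeMake(tileTexture.width, tileTexture.height, 1)
            toTexture:canvasTexture
     destinationSlice:0
     destinationLevel:mipLevel
    destinationOrigin:MTLOriginMake(tileOriginX, tileOriginY, 0)];
[blit endEncoding];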
