As stated in the Metal Shading Language Guide:
> Writes to a buffer or a texture are disallowed from a fragment function.
I understand that this is the case, but I'm curious as to why. Being able to write to a buffer from within a fragment shader is incredibly useful. I understand that it is likely more complex on the hardware end when the destination of a thread's memory writes isn't known ahead of time, as is the case with raw buffer writes, but Metal already exposes exactly this capability in compute shaders, so why not in fragment shaders too?
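For comparison, here's a minimal sketch of the compute-side capability I'm referring to (the kernel name, bindings, and index buffer are just illustrative):

```
#include <metal_stdlib>
using namespace metal;

// Minimal compute kernel: each thread writes to a destination computed
// at runtime, i.e. a scattered write. Names and bindings are illustrative.
kernel void scatter_write(device float      *out [[buffer(0)]],
                          const device uint *idx [[buffer(1)]],
                          uint tid [[thread_position_in_grid]])
{
    // The target location depends on data, not on the thread's position
    // in the grid -- exactly what fragment functions are barred from doing.
    out[idx[tid]] = float(tid);
}
```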
Addendum
I should clarify why I think buffer writes from fragment functions are useful. In the most common use of the rasterization pipeline, triangles are rasterized, shaded by the fragment shader, and written into predefined memory locations, known before each fragment shader invocation and determined by the fixed mapping from normalized device coordinates to the frame buffer. This fits most use cases, since most of the time you just want to render triangles directly to a buffer or the screen.
There are other cases in which you might want to do a lazy write within the fragment shader, where the end location is based on fragment properties rather than the fragment's exact screen location; effectively, rasterization with side effects. For instance, most GPU-based voxelization works by rendering the scene with an orthographic projection from some desirable angle, and then writing into a 3D texture, mapping the XY coordinates of the fragment and its associated depth value to a location in the 3D texture. This is described here.
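To make that concrete, here's a rough sketch of what such a pass might look like if fragment functions accepted writable textures. This is hypothetical Metal (it won't compile today, which is the whole point), and the struct, grid size, and bindings are my own assumptions:

```
#include <metal_stdlib>
using namespace metal;

constant uint GRID_SIZE = 256;  // assumed voxel grid resolution

struct VoxelVertexOut {
    float4 position [[position]];
    float3 albedo;
};

// HYPOTHETICAL: Metal rejects access::write textures as fragment-function
// arguments, but the desired voxelization pass would look roughly like this.
fragment void voxelize(VoxelVertexOut in [[stage_in]],
                       texture3d<float, access::write> voxels [[texture(0)]])
{
    // Map the fragment's window XY position and its depth to a voxel cell.
    uint3 cell = uint3(uint(in.position.x),
                       uint(in.position.y),
                       uint(in.position.z * GRID_SIZE));
    voxels.write(float4(in.albedo, 1.0), cell);
}
```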
Other uses include some forms of order-independent transparency (transparency where draw order is unimportant, allowing for overlapping transparent objects). One solution is to use a multi-layered frame buffer, and then to sort and blend the fragments by depth in a separate pass. Since most GPUs have no hardware support for this (Intel's, I believe, have hardware acceleration for it), you have to maintain atomic counters and perform manual texture/buffer writes from each fragment to coordinate writes to the layered frame buffer.
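The build pass of such a per-pixel linked list might look roughly like this. Again, this is hypothetical Metal, since it needs buffer writes and atomics from a fragment function, and all names and bindings are assumed:

```
#include <metal_stdlib>
using namespace metal;

struct FragmentNode {
    float4 color;
    float  depth;
    uint   next;    // index of the next node in this pixel's list
};

struct BlendVertexOut {
    float4 position [[position]];
    float4 color;
};

// HYPOTHETICAL build pass for order-independent transparency: each fragment
// appends itself to a per-pixel linked list in device memory.
fragment void build_lists(BlendVertexOut in [[stage_in]],
                          device atomic_uint  *counter [[buffer(0)]],
                          device FragmentNode *nodes   [[buffer(1)]],
                          device atomic_uint  *heads   [[buffer(2)]],
                          constant uint &framebufferWidth [[buffer(3)]])
{
    // Claim a slot in the global node pool.
    uint slot = atomic_fetch_add_explicit(counter, 1u, memory_order_relaxed);

    uint pixel = uint(in.position.y) * framebufferWidth + uint(in.position.x);

    // Push this fragment onto the front of the pixel's list.
    uint prev = atomic_exchange_explicit(&heads[pixel], slot,
                                         memory_order_relaxed);
    nodes[slot].color = in.color;
    nodes[slot].depth = in.position.z;
    nodes[slot].next  = prev;
}
```

A second pass would then walk each pixel's list, sort by depth, and blend.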
Yet another example is the extraction of virtual point lights for global illumination (i.e., you write out point lights for relevant fragments as you rasterize). In all of these use cases, buffer writes from fragment shaders are required, because the ROPs store only one resulting fragment per pixel. The only way to achieve equivalent results without this feature is some manner of depth peeling, which is horribly slow for scenes with high depth complexity.
Now I realize that the examples I gave aren't really about buffer writes in particular, but more generally about dynamic memory writes from fragment shaders, ideally with support for atomic operations. Buffer writes just seem like the simplest case, and their inclusion would go a long way toward improving the situation.
Since I wasn't getting any answers here, I ended up posting the question on Apple's developer forums. I got more feedback there, but still no real answer. Unless I'm missing something, virtually every OS X device that officially supports Metal has hardware support for this feature. As I understand it, the feature first started appearing in GPUs around 2009. It's common in both current DirectX and OpenGL (not even considering DX12 or Vulkan), so Metal would be the only "cutting-edge" API that lacks it.
I realize that this feature might not be supported on PowerVR hardware, but Apple has had no issue differentiating the Metal Shading Language by feature set. For instance, Metal on iOS allows for "free" frame buffer fetches within fragment shaders, which is directly supported in hardware by the cache-heavy PowerVR architecture. This feature manifests itself directly in the Metal Shading Language, as it allows you to declare fragment function inputs with the `[[color(m)]]` attribute qualifier for iOS shaders. Arguably, allowing buffers in the `device` address space, or textures with `access::write`, as inputs to fragment shaders would be no greater a semantic change to the language than what Apple has done to optimize for iOS. So, as far as I'm concerned, a lack of support by PowerVR would not explain the absence of the feature I'm looking for on OS X.
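For reference, here's roughly what that iOS-only frame buffer fetch looks like in practice (the struct and blend math are just an example):

```
#include <metal_stdlib>
using namespace metal;

struct FetchVertexOut {
    float4 position [[position]];
    float4 color;
};

// iOS-only: the current frame buffer value arrives as a fragment-function
// input tagged [[color(0)]], enabling programmable blending in the shader.
fragment float4 programmable_blend(FetchVertexOut in [[stage_in]],
                                   float4 dst [[color(0)]])
{
    // "Over" blend, assuming premultiplied alpha.
    return in.color + dst * (1.0 - in.color.a);
}
```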