Semantics of barrier() in an OpenGL compute shader

Let's say I have an OpenGL compute shader written in GLSL, executing on an NVIDIA GeForce GTX 970.

At the start of the shader, a single invocation writes to a "Shader Storage Buffer Object" (SSBO).

I then issue a suitable barrier in my GLSL, such as memoryBarrier().

I then read from the memory written in the first step, in each invocation.

Will that first write be visible to all invocations in the current compute operation?

At https://www.khronos.org/opengl/wiki/Memory_Model#Ensuring_visibility , Khronos say:

"Use coherent and an appropriate memoryBarrier* or groupMemoryBarrier call if you use a mechanism like barrier to synchronize between invocations."

I'm pretty sure it's possible to synchronize this way within a work group. But does it work for all invocations in every work group, in the entire compute operation?

I'm unsure how an entire set of work groups is scheduled. I would expect that they might run sequentially, which would make the kind of synchronization I'm asking about impossible?

But does it work for all invocations in every work group, in the entire compute operation?

No. The scope of barrier() is explicitly a single work group. And you cannot have visibility of operations that you haven't ensured have happened yet. The order in which work groups execute with respect to one another is undefined, so an invocation cannot know whether the work group that performs the write has executed yet.

What you want isn't really possible. Instead, you need to change how your shaders work so that work groups do not depend on each other. In this case, you can have every work group perform this computation itself. And instead of storing the result in global memory via an SSBO, store it in a shared variable.
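As a rough illustration of that restructuring, here is a minimal sketch in which one invocation per work group computes the value into a shared variable and the rest of the group reads it after barrier(). The names groupValue, computeValue(), Output and results are hypothetical, the constant returned by computeValue() is only a stand-in for the real computation, and the buffer is assumed to be large enough for every invocation in the dispatch.

```glsl
#version 430

layout(local_size_x = 64) in;

// Hypothetical output buffer; assumed to hold one float per invocation.
layout(std430, binding = 0) buffer Output {
    float results[];
};

// One copy of this variable exists per work group. It is visible only
// to the invocations of that group, so no cross-group ordering is needed.
shared float groupValue;

// Placeholder for whatever the single invocation used to compute.
float computeValue() {
    return 42.0;
}

void main() {
    // One invocation in EACH work group performs the computation.
    if (gl_LocalInvocationIndex == 0u) {
        groupValue = computeValue();
    }

    // Order the shared write, then wait for every invocation in this
    // work group. The memoryBarrierShared() call is the conservative
    // idiom; it may be redundant on implementations where barrier()
    // already makes shared writes visible.
    memoryBarrierShared();
    barrier();

    // Every invocation in the group can now safely read groupValue.
    results[gl_GlobalInvocationID.x] =
        groupValue * float(gl_LocalInvocationID.x);
}
```

Note that barrier() sits outside the if, so every invocation in the group reaches it; calling it from non-uniform control flow is not allowed.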

Yes, you'll be computing the same value in each group. But that will yield better performance than having all of those work groups wait on one work group, especially since that's not something you can actually do.
