C++, OpenGL Z-buffer prepass

Asked 25/12, 2010 at 20:6 Answered 21/6, 2011 at 22:3

I'm making a simple voxel engine (think Minecraft) and am currently at the stage of getting rid of occluded faces to gain some precious fps. I'm not very experienced with OpenGL and do not quite understand how the glColorMask magic works.

This is what I have:

// new and shiny
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

// this one goes without saying
glEnable(GL_DEPTH_TEST);

// I want to see my code working, so fill the mask
glPolygonMode(GL_FRONT_AND_BACK, GL_FILL);

// fill the z-buffer, or whatever
glDepthFunc(GL_LESS);
glColorMask(0,0,0,0);
glDepthMask(GL_TRUE);

// do a first draw pass
world_display();

// now only show lines, so I can see the occluded lines do not display
glPolygonMode(GL_FRONT_AND_BACK, GL_LINE);

// I guess the error is somewhere here
glDepthFunc(GL_LEQUAL);
glColorMask(1,1,1,1);
glDepthMask(GL_FALSE);

// do a second draw pass for the real rendering
world_display();

This somewhat works, but once I change the camera position the world starts to fade away: I see fewer and fewer lines until nothing is left at all.

Webbing answered 25/12, 2010 at 20:6 Comment(2)

RE: precious FPS, does world_display() already use vertex arrays (VAs) or vertex buffer objects (VBOs) for geometry submission? – Pimp 26/12, 2010 at 0:23

+1 for @genpfault. BTW, can you post some screenshots of what's happening in your case? Unless I'm missing something obvious, the mistake might be somewhere else - the above code looks OK to me. – Cowans 26/12, 2010 at 12:34

It sounds like you are not clearing your depth buffer.

You need to have depth writing enabled (via glDepthMask(GL_TRUE);) when you clear the depth buffer with glClear. You probably still have it disabled from the previous frame, causing all your clears to be no-ops in subsequent frames. Just move your glDepthMask call before the glClear.
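To make the failure mode concrete, here is a toy software model of that rule. It is not OpenGL: `ToyDepthBuffer` and `secondFramePasses` are hypothetical names invented for this sketch, and the only behaviour it mimics is that a depth clear is skipped while depth writes are masked off.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for the depth buffer plus the glDepthMask state (assumption:
// one float per pixel, 1.0f means "cleared / far plane").
struct ToyDepthBuffer {
    std::vector<float> depth;
    bool depthMask = true;  // models glDepthMask(GL_TRUE / GL_FALSE)

    explicit ToyDepthBuffer(std::size_t pixels) : depth(pixels, 1.0f) {}

    // Models glClear(GL_DEPTH_BUFFER_BIT): does nothing while writes are masked.
    void clear() {
        if (!depthMask) return;
        for (float& d : depth) d = 1.0f;
    }

    // Models drawing one fragment with glDepthFunc(GL_LESS).
    // Returns true if the fragment passes the depth test.
    bool drawLess(std::size_t pixel, float z) {
        assert(pixel < depth.size());
        if (!(z < depth[pixel])) return false;
        if (depthMask) depth[pixel] = z;
        return true;
    }
};

// Runs two frames in the order the question's code uses and reports whether
// the second frame's fragment survives the depth test.
bool secondFramePasses(bool restoreMaskBeforeClear) {
    ToyDepthBuffer buf(1);

    // Frame 1: clear, prepass writes depth 0.2, then depth writes are disabled
    // for the colour pass and stay disabled at the end of the frame.
    buf.clear();
    buf.drawLess(0, 0.2f);
    buf.depthMask = false;

    // Frame 2: the camera moved, the visible surface is now at depth 0.5.
    if (restoreMaskBeforeClear) buf.depthMask = true;  // the suggested fix
    buf.clear();
    buf.depthMask = true;  // the question enables writes only after the clear
    return buf.drawLess(0, 0.5f);
}
```

In this model `secondFramePasses(false)` fails the depth test because the stale 0.2 from the previous frame is still in the buffer, while `secondFramePasses(true)` passes. That matches the symptom of geometry disappearing as the camera moves.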

Discretion answered 26/12, 2010 at 13:2 Comment(3)

D'oh, that's probably the reason, good catch. See kids: That's why you always set all OpenGL state required for an operation, just before that operation. – Sitsang 26/12, 2010 at 15:8

I feel stupid now. Unfortunately it halves my fps rather than double them, dammit! – Webbing 26/12, 2010 at 15:22

Yes. Z pre-pass is only effective when you are fillrate bound (and if you have very expensive per-pixel operations), not vertex bound. – Discretion 26/12, 2010 at 15:27

glColorMask and glDepthMask determine which parts of the frame buffer are actually written to.

The idea of early Z culling is to render only the depth buffer first. The actual savings come from sorting the geometry near to far, so that the GPU can quickly discard occluded fragments. While filling the Z buffer you don't want to write the color component, which lets you switch off shaders and texturing; in short, everything that's computationally intensive.

A word of warning: early Z only works with opaque geometry. Actually, the whole depth buffer algorithm only works for opaque stuff. As soon as you're blending, you have to sort far to near and cannot rely on depth buffering (search for "order independent transparency" for algorithms that overcome the associated problems).

So if you've got anything that's blended, remove it from the 'early Z' stage.

In the first pass you set

glDepthMask(GL_TRUE); // enable depth buffer writes
glColorMask(0,0,0,0); // disable color buffer writes (glColorMask takes four arguments: R, G, B, A)
glDepthFunc(GL_LESS); // use normal depth order testing
glEnable(GL_DEPTH_TEST); // and we want to perform depth tests

After the Z pass is done you change the settings a bit

glDepthMask(GL_FALSE); // don't write to the depth buffer
glColorMask(1,1,1,1); // now write the color components (R, G, B, A)

glDepthFunc(GL_EQUAL); // only draw if the depth of the incoming fragment
                       // matches the depth already in the depth buffer

GL_LEQUAL does the job too, but it also passes fragments that are closer than the value stored in the depth buffer. Since the depth buffer is no longer updated in this pass, any fragment between the viewer and the stored depth will overwrite the pixel each time something is drawn there.
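The two passes and the GL_EQUAL versus GL_LEQUAL difference can be sketched with a one-pixel software model. This is an illustration only, not OpenGL: `DepthFunc`, `PixelState`, `drawPass` and the two `shaded...` helpers are made-up names, and each float stands for the depth of one fragment covering the same pixel.

```cpp
#include <cassert>
#include <vector>

enum class DepthFunc { Less, LessOrEqual, Equal };

struct PixelState {
    float depth = 1.0f;  // cleared depth value
    int shaded = 0;      // how often the (expensive) colour path ran
};

static bool passes(DepthFunc func, float z, float stored) {
    switch (func) {
        case DepthFunc::Less:        return z < stored;
        case DepthFunc::LessOrEqual: return z <= stored;
        case DepthFunc::Equal:       return z == stored;
    }
    return false;
}

// Draws every fragment covering the pixel, in submission order.
// depthWrite / colorWrite play the role of glDepthMask / glColorMask.
void drawPass(PixelState& px, const std::vector<float>& frags,
              DepthFunc func, bool depthWrite, bool colorWrite) {
    for (float z : frags) {
        if (!passes(func, z, px.depth)) continue;
        if (depthWrite) px.depth = z;
        if (colorWrite) ++px.shaded;
    }
}

// Single pass: depth and colour written together with GL_LESS semantics.
int shadedWithoutPrepass(const std::vector<float>& frags) {
    PixelState px;
    drawPass(px, frags, DepthFunc::Less, true, true);
    return px.shaded;
}

// Depth-only prepass over prepassFrags, then a colour-only pass over
// colourFrags using the given depth function.
int shadedWithPrepass(const std::vector<float>& prepassFrags,
                      const std::vector<float>& colourFrags,
                      DepthFunc secondPassFunc) {
    PixelState px;
    drawPass(px, prepassFrags, DepthFunc::Less, true, false);
    drawPass(px, colourFrags, secondPassFunc, false, true);
    return px.shaded;
}
```

With fragments submitted far to near, e.g. `{0.9f, 0.6f, 0.3f}`, the single pass shades all three, while the prepass variant shades only the nearest one with either depth function. The functions differ once the colour pass contains fragments the prepass never saw: for a colour pass of `{0.1f, 0.2f, 0.3f}` against that prepass, `Equal` still shades one fragment, whereas `LessOrEqual` shades all three, and the 0.2 fragment overwrites the nearer 0.1 one. Note that on real hardware GL_EQUAL relies on both passes producing bit-identical depth values, so the vertex transformation has to be the same in both.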

A variation on the theme is to use an 'early Z' populated depth buffer as a geometry buffer in multiple deferred shading passes afterwards.

To cull even more geometry, take a look at occlusion queries. With an occlusion query you ask the GPU how many fragments, if any, pass all tests. This being a voxel engine, you're probably using an octree or k-d tree. Drawing the bounding faces of a tree branch (with glDepthMask(0), glColorMask(0,0,0,0)) before traversing that branch tells you whether any geometry in it is visible at all. Combined with a near-to-far sorted traversal and a (coarse) frustum culling on the tree, this will give you HUGE performance benefits.
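A rough sketch of that traversal, with the GPU part abstracted away. The `Node` layout and the two callbacks are assumptions made for this example. In a real renderer `boundsVisible` would wrap an occlusion query (glBeginQuery(GL_SAMPLES_PASSED, id), draw the bounding faces, glEndQuery, then read GL_QUERY_RESULT with glGetQueryObjectuiv), and `drawLeaf` would submit the leaf's actual geometry.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical tree node; a real engine would store a bounding box and
// mesh handles instead of a bare distance and id.
struct Node {
    float distance = 0.0f;  // distance from the camera to the node's bounds
    int id = -1;            // leaf payload id, -1 for inner nodes
    std::vector<Node> children;
};

// boundsVisible: stand-in for "did any fragment of the node's bounding faces
// pass the depth test?" (the occlusion query result).
// drawLeaf: stand-in for drawing the leaf's real geometry.
void traverse(Node& node,
              const std::function<bool(const Node&)>& boundsVisible,
              const std::function<void(const Node&)>& drawLeaf) {
    if (!boundsVisible(node)) return;  // whole branch hidden: skip it

    if (node.children.empty()) {
        drawLeaf(node);
        return;
    }

    // Visit children near to far so close geometry fills the depth buffer
    // before the bounds of farther branches are tested.
    std::sort(node.children.begin(), node.children.end(),
              [](const Node& a, const Node& b) { return a.distance < b.distance; });
    for (Node& child : node.children) {
        traverse(child, boundsVisible, drawLeaf);
    }
}
```

This version tests each node synchronously, which is the simplest thing to reason about. Reading a query result right after issuing it can stall the pipeline, so real implementations usually defer the readback. That refinement is left out here.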

Sitsang answered 26/12, 2010 at 11:37 Comment(2)

Very good remarks! But the OP's code seems to be coherent with your description of early-z. – Cowans 26/12, 2010 at 12:34

world_display() may do unfortunate things, like use blending, change some states and the like. In principle, for 'early Z' to work you need three separate render paths: one just for the depth pass, one for the opaque pass, and finally one for everything translucent. – Sitsang 26/12, 2010 at 13:55

A Z pre-pass can work with translucent objects. If objects are translucent, do not render them in the prepass; afterwards, sort them by depth and render them.
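A minimal sketch of that split, assuming a hypothetical `Drawable` record with a precomputed view-space distance (neither the struct nor `buildRenderLists` comes from the question's code). Opaque objects go into the prepass and colour pass sorted near to far; translucent ones are kept out of the prepass and sorted far to near for blending afterwards.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical per-object record used only for this illustration.
struct Drawable {
    int id;
    float viewDepth;   // distance from the camera
    bool translucent;
};

struct RenderLists {
    std::vector<Drawable> opaque;       // near to far: prepass + colour pass
    std::vector<Drawable> translucent;  // far to near: blended last
};

RenderLists buildRenderLists(const std::vector<Drawable>& scene) {
    RenderLists lists;
    for (const Drawable& d : scene) {
        if (d.translucent) {
            lists.translucent.push_back(d);
        } else {
            lists.opaque.push_back(d);
        }
    }
    // Near to far helps the GPU reject hidden opaque fragments early.
    std::sort(lists.opaque.begin(), lists.opaque.end(),
              [](const Drawable& a, const Drawable& b) { return a.viewDepth < b.viewDepth; });
    // Far to near so blending composites in the correct order.
    std::sort(lists.translucent.begin(), lists.translucent.end(),
              [](const Drawable& a, const Drawable& b) { return a.viewDepth > b.viewDepth; });
    return lists;
}
```

A common arrangement, not spelled out in this thread, is to draw the translucent list with the depth test still enabled against the opaque depth buffer but with depth writes off.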

Spencer answered 21/6, 2011 at 22:3 Comment(0)
