Infinite cube world engine (like Minecraft) optimization suggestions?

Asked 9/1, 2014 at 10:47 Answered 26/7, 2014 at 9:56

opengl optimization minecraft cube voxel


As a fun project (and to get my Minecraft-addicted son excited about programming), I am building a 3D Minecraft-like voxel engine using C# (.NET 4.5.1), OpenGL and GLSL 4.x.

Right now my world is built from chunks. Chunks are stored in a dictionary, keyed by a 64-bit value that combines the chunk coordinates as X | Z<<32. This lets me create an 'infinite' world in which chunks are cached in and out.
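A minimal sketch of such a key, assuming signed 32-bit chunk coordinates. `ChunkKey`, `Pack` and `Unpack` are hypothetical names, not the asker's actual code. One thing to watch for: a negative X that is widened to 64 bits sign-extends and overwrites the Z half of the key, so the sketch casts X to `uint` first.

```csharp
using System;

// Hypothetical helper for illustration; not taken from the engine in the question.
public static class ChunkKey
{
    // X goes into the low 32 bits, Z into the high 32 bits.
    // The uint cast zero-extends X, so a negative X cannot spill into Z's bits.
    public static long Pack(int x, int z)
    {
        return ((long)z << 32) | unchecked((uint)x);
    }

    // Recover both coordinates from a packed key.
    public static void Unpack(long key, out int x, out int z)
    {
        x = unchecked((int)(key & 0xFFFFFFFFL));
        z = (int)(key >> 32);
    }
}
```

The packed value can then be used directly as the key of a `Dictionary<long, TChunk>`.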

Every chunk consists of an array of 16x16x16 block segments. Starting from level 0 (bedrock), it can go as high as you want (unlike Minecraft, where the limit is 256, I think).
Chunks are queued for generation on a separate thread when they come into view and need to be rendered. This means that chunks might not show up right away, although in practice you will not notice this. NOTE: I am not waiting for them to be generated; they are simply not visible immediately.
When a chunk needs to be rendered for the first time, a VBO (glGenBuffers, GL_STREAM_DRAW, etc.) is generated for that chunk containing the possibly visible (outside) faces; neighboring chunks are checked as well. [This means that a chunk potentially needs to be re-tessellated when a neighbor has been modified.] When tessellating, the opaque faces are tessellated first for every segment, and then the transparent ones. Every segment knows where it starts within that vertex array and how many vertices it has, for both opaque and transparent faces.
Textures are taken from an array texture.
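The "possibly visible faces" rule described above can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the asker's mesher: blocks are reduced to an opaque/empty flag, everything outside the segment is treated as empty (a real mesher would look into the neighboring segment or chunk), and the sketch only counts faces instead of writing vertices. `SegmentMesher` and its members are hypothetical names.

```csharp
using System;

// Hypothetical sketch: counts the faces a mesher would emit for one 16x16x16 segment.
public static class SegmentMesher
{
    public const int Size = 16;

    // The six axis-aligned neighbor offsets of a block.
    static readonly int[][] Dirs =
    {
        new[] { 1, 0, 0 }, new[] { -1, 0, 0 },
        new[] { 0, 1, 0 }, new[] { 0, -1, 0 },
        new[] { 0, 0, 1 }, new[] { 0, 0, -1 }
    };

    // blocks[x, y, z] is true where an opaque block exists.
    public static int CountVisibleFaces(bool[,,] blocks)
    {
        int faces = 0;
        for (int x = 0; x < Size; x++)
        for (int y = 0; y < Size; y++)
        for (int z = 0; z < Size; z++)
        {
            if (!blocks[x, y, z]) continue;
            foreach (int[] d in Dirs)
            {
                // A face is emitted only when the block next to it is not opaque.
                if (!IsOpaque(blocks, x + d[0], y + d[1], z + d[2])) faces++;
            }
        }
        return faces;
    }

    // Assumption: positions outside the segment count as empty.
    static bool IsOpaque(bool[,,] blocks, int x, int y, int z)
    {
        if (x < 0 || y < 0 || z < 0 || x >= Size || y >= Size || z >= Size) return false;
        return blocks[x, y, z];
    }
}
```

Under these assumptions a completely solid segment contributes only its outer shell (6 x 16 x 16 faces) rather than six faces for each of its 4096 blocks.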

When rendering:

I first take the bounding box of the frustum and map that onto the chunk grid. Using that knowledge I pick every chunk that is within the frustum and within a certain distance of the camera.
Now I do a distance sort on the chunks.
After that I determine the ranges (index, length) of the chunk segments that are actually visible. NOW I know exactly which segments (and which vertex ranges) are 'at least partially' in view. The only excess segments that I have are the ones that are hidden behind mountains or 'sometimes' deep underground.
Then I start rendering ... first I render the opaque faces [culling and depth test enabled, alpha test and blend disabled] front to back using the known vertex ranges. Then I render the transparent faces back to front [blend enabled].
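The draw order in the last two steps can be sketched as a single distance sort that is walked in both directions. This is only an illustration: `ChunkPos` and `DrawOrder` are hypothetical names, and distance is measured between chunk grid coordinates rather than from the exact camera position.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical chunk grid position.
public struct ChunkPos
{
    public int X;
    public int Z;
    public ChunkPos(int x, int z) { X = x; Z = z; }
}

public static class DrawOrder
{
    // Returns the chunks sorted nearest first (the order for the opaque pass).
    public static List<ChunkPos> FrontToBack(IEnumerable<ChunkPos> chunks, ChunkPos camera)
    {
        var sorted = new List<ChunkPos>(chunks);
        sorted.Sort((a, b) => DistSq(a, camera).CompareTo(DistSq(b, camera)));
        return sorted;
    }

    // Squared distance is enough for ordering and avoids a square root.
    static long DistSq(ChunkPos a, ChunkPos c)
    {
        long dx = (long)a.X - c.X;
        long dz = (long)a.Z - c.Z;
        return dx * dx + dz * dz;
    }
}
```

Iterating the returned list forwards gives the front-to-back order for opaque faces; iterating it backwards gives the back-to-front order for transparent faces, so one sort per frame covers both passes.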

Now... does anyone know a way of improving this while still allowing dynamic generation of an infinite world? I am currently reaching ~80fps@1920x1080 and ~120fps@1024x768 (screenshots: https://i.sstatic.net/t4k30.jpg, https://i.sstatic.net/prV8X.jpg) on an average 2.2 GHz i7 laptop with an ATI HD8600M gfx card. I think it must be possible to increase the frame rate. And I think I have to, as I want to add entity AI and sound, and do bump and specular mapping. Could using Occlusion Queries help me out? ... I can't really imagine that they would, given the nature of the segments. I have already minimized the creation of objects, so there is no 'new Object' all over the place. Also, as the performance doesn't really change between Debug and Release mode, I don't think it's the code but rather the approach to the problem.

edit: I have been thinking of using GL_SAMPLE_ALPHA_TO_COVERAGE but it doesn't seem to be working?

gl.Enable(GL.DEPTH_TEST);
gl.Enable(GL.BLEND); // gl.Disable(GL.BLEND);
gl.Enable(GL.MULTI_SAMPLE);
gl.Enable(GL.SAMPLE_ALPHA_TO_COVERAGE);

Navicert asked 9/1, 2014 at 10:47 Comment(9)

Maybe try adding the extra stuff first, and then see if you need to optimise. Especially if this is a project to get your son interested in programming, you probably want to keep it simple and accessible at first. If, however, you feel compelled to do some optimisation, perhaps store your chunks as sparse octrees. This way you can cull a lot of the non-visible voxels before rendering and send just the hull to the GPU; this is a technique used by most voxel engines. (Note: Minecraft does not actually use a voxel engine itself.) – Fitting 9/1, 2014 at 12:39

Yeah, true. Minecraft doesn't really use voxels, but polygonal cubes (changed the title). So my question is now, how can I improve on what I am already doing? ... Rendering a massive amount of cubes. I've read octrees don't really help too much when rendering a grid based cube world. – Defective 9/1, 2014 at 13:31

@It'sme...Alex Profile your application to find the bottleneck. – Papaya 9/1, 2014 at 13:42

@It'sme...Alex yeah, they are only really helpful when the voxel resolution is quite high (so that culling everything apart from the hull voxels makes an appreciable difference) – Fitting 9/1, 2014 at 13:49

I think you can get away with not looking at neighbors during tessellation – Thirlage 9/1, 2014 at 13:54

@danijar: profiling doesn't help too much. There is not much time spent in code. I am not tessellating the chunks every time. I generate a VBO once. Then only when something has changed in the chunk do I re-tessellate. But that doesn't happen too often. – Defective 9/1, 2014 at 19:02

@It'sme...Alex That sounds kind of ironic. How would you optimize your application without reducing the time spent in code? By profiling, you find out which part of your application is most worth optimizing. – Papaya 9/1, 2014 at 19:07

@user2485710: Minecraft is, as asQuirreL already pointed out, not really a voxel game. It uses polygon cubes, not voxels. Although maybe you can use a similar approach on them. Also, I did a lot of research. That's why I am reaching 160fps now (found out that I was rendering opaque cubes twice) ... but I have the feeling that I must be able to do it faster. – Defective 9/1, 2014 at 19:10

@danijar: well, not really. Maybe you could say that I send too many triangles to the graphics card ... it's hard to profile that. And I do that with one call to glDrawArrays. Or maybe I am not spending 'enough' time finding out which cubes not to render. The CPU is only at 9% with peaks to 30%. So it's not as if it's struggling. – Defective 9/1, 2014 at 19:12

To render a lot of similar objects, I strongly suggest you take a look at instanced drawing: glDrawArraysInstanced and/or glDrawElementsInstanced.

It made a huge difference for me. I'm talking from 2 fps to over 60 fps to render 100000 similar icosahedrons.

You can parametrize your cubes by using instanced attributes (glVertexAttribDivisor and friends) to make them different. Hope this helps.
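As a rough sketch of the CPU side of this, the per-instance data for the cubes could be packed into one flat array. The layout here (position plus array-texture layer, four floats per cube) and the names `CubeInstance` and `CubeInstances` are assumptions made for illustration; the OpenGL calls are only mentioned in comments because the binding used in the question is not shown.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical per-cube data: world position plus array-texture layer.
public struct CubeInstance
{
    public float X, Y, Z, Layer;
    public CubeInstance(float x, float y, float z, float layer)
    {
        X = x; Y = y; Z = z; Layer = layer;
    }
}

public static class CubeInstances
{
    public const int FloatsPerInstance = 4;

    // Packs all instances into one flat array, ready to upload into a buffer object.
    public static float[] Pack(IList<CubeInstance> cubes)
    {
        var data = new float[cubes.Count * FloatsPerInstance];
        for (int i = 0; i < cubes.Count; i++)
        {
            int o = i * FloatsPerInstance;
            data[o] = cubes[i].X;
            data[o + 1] = cubes[i].Y;
            data[o + 2] = cubes[i].Z;
            data[o + 3] = cubes[i].Layer;
        }
        return data;
    }
}

// After uploading the array, the attribute reading it would be given a divisor of 1
// with glVertexAttribDivisor(index, 1), so it advances once per cube rather than
// once per vertex. A single glDrawArraysInstanced(GL_TRIANGLES, 0, 36, cubeCount)
// then draws every cube from one shared 36-vertex cube mesh.
```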

Hubert answered 9/1, 2014 at 16:29 Comment(4)

glDrawArraysInstanced sounds promising in combination with an array texture. Then I just have to find out how to position the cubes correctly. This could also drastically reduce the amount of memory that's being used. – Defective 9/1, 2014 at 19:26

Just realized that it might not be the solution after all, as not 'everything' is a cube. There will be blocks like fences and stairs as well. – Defective 9/1, 2014 at 19:50

Well, it could speed things up for the cubes at least. The position of the cubes can be an instance attribute. Let me know if you need more info on instance attributes. – Hubert 9/1, 2014 at 20:01

And for blocks like stairs and fences, just call a drawInstanced for them as well. It's still 3 calls instead of N. Anyway, keep us posted, this is an interesting topic. – Hubert 9/1, 2014 at 21:26

It's on ~200fps currently, should be OK. The 3 main things that I've done are:

1) Generating the chunks on a separate thread.
2) Tessellating the chunks on a separate thread.
3) Using a deferred rendering pipeline.

I don't really think the last one contributed much to the overall performance, but I had to start using it because of some of the shaders. Now the CPU is sort of falling asleep @ ~11%.

Navicert answered 31/1, 2014 at 14:29 Comment(1)

After not working on the engine for a couple of weeks, I added Order Independent Transparency and Occlusion Queries. Now there is one thing left ... having proper biomes. Oh yeah, and animated textures. – Defective 11/3, 2014 at 15:10

This question is pretty old, but I'm working on a similar project. I approached it almost exactly the same way as you; however, I added one additional optimization that helped out a lot.

For each chunk, I determine which sides are completely opaque. I then use that information to do a flood fill through the chunks to cull out the ones that are underground. Note, I'm not checking individual blocks when I do the flood fill, only a precomputed bitmask for each chunk.

When I'm computing the bitmask, I also check to see if the chunk is entirely empty, since empty chunks can obviously be ignored.
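One way such a chunk-level flood fill might look is sketched below. It is a guess at the idea, not this answer's actual code: `ChunkInfo`, `ChunkFloodFill` and the traversal rules are assumptions. The rules used here are that the fill never leaves a chunk through a completely opaque side, and that a chunk entered through one of its own opaque sides is still drawn (its surface is visible) but not expanded further from that entry. The sketch assumes C# 7 or later for tuples.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical per-chunk summary, precomputed when the chunk is built.
public sealed class ChunkInfo
{
    public int OpaqueSides; // bit i set: side i (see Dirs) is completely opaque
    public bool IsEmpty;    // true if the chunk contains no blocks at all
}

public static class ChunkFloodFill
{
    // Side order: +X, -X, +Y, -Y, +Z, -Z. The opposite of side i is i ^ 1.
    static readonly int[,] Dirs =
        { { 1, 0, 0 }, { -1, 0, 0 }, { 0, 1, 0 }, { 0, -1, 0 }, { 0, 0, 1 }, { 0, 0, -1 } };

    // Returns the chunks worth drawing, starting from the chunk containing the camera.
    public static HashSet<(int, int, int)> FindVisible(
        Dictionary<(int, int, int), ChunkInfo> chunks, (int, int, int) start)
    {
        var toRender = new HashSet<(int, int, int)>();
        if (!chunks.TryGetValue(start, out var first)) return toRender;
        if (!first.IsEmpty) toRender.Add(start);

        var expanded = new HashSet<(int, int, int)> { start };
        var queue = new Queue<(int, int, int)>();
        queue.Enqueue(start);

        while (queue.Count > 0)
        {
            var (x, y, z) = queue.Dequeue();
            ChunkInfo current = chunks[(x, y, z)];
            for (int side = 0; side < 6; side++)
            {
                // Nothing can be seen through a completely opaque side.
                if ((current.OpaqueSides & (1 << side)) != 0) continue;

                var next = (x + Dirs[side, 0], y + Dirs[side, 1], z + Dirs[side, 2]);
                if (!chunks.TryGetValue(next, out var info)) continue;

                // Empty chunks are never drawn, but the fill still passes through them.
                if (!info.IsEmpty) toRender.Add(next);

                // Entering through an opaque side shows the surface only.
                bool entryOpaque = (info.OpaqueSides & (1 << (side ^ 1))) != 0;
                if (!entryOpaque && expanded.Add(next)) queue.Enqueue(next);
            }
        }
        return toRender;
    }
}
```

With three chunks in a row, where the middle one is solid on all sides, the far chunk is never reached and is therefore culled. In practice the result would still be intersected with the frustum and distance tests described in the question.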

Streamlet answered 26/7, 2014 at 9:56 Comment(0)
