Does interleaving in VBOs improve performance when using VAOs?

You usually get a speedup when you use an interleaved VBO instead of multiple VBOs. Does this still hold when using VAOs?

I ask because it is much more convenient to have one VBO for the positions, one for the normals, and so on. A single VBO can then also be used in multiple VAOs.

Giorgi answered 17/9, 2013 at 15:19 Comment(6)
One thing to keep in mind, VAOs are just thinly wrapped state containers. This question could (and should) just as easily be asked about VBOs in general. The only thing VAOs do is keep your vertex pointers and various other vertex array states around persistently so you do not have to make a million API calls every time you want to draw something.Dhow
It's worth watching Valve's Linux source porting talk (the whole talk is fairly interesting, most of it's OpenGL related rather than Linux specific). Apparently they found VAO to actually have worse performance. Normally you don't change most of those states when drawing a bunch of objects unless your objects are in a bunch of different formats.Chamberlin
Really? I would have thought Valve would avoid deprecated OpenGL for improved forward compatibility. You have to use at least one VAO for core OpenGL... If any platform ever does force core OpenGL, I would wager my money on Apple, who "think different" (incompatible), and whose OS makes up a noteworthy portion of the Steam storefront. Great presentation though.Dhow
@AndonM.Coleman: Apple and Intel on Linux do think differently. There is no compat profile for GL3.2 on OSX and Intel will not implement ARB_compatibility for GL 3.2+ on Linux.Prison
They still provide fallbacks for older versions of OpenGL though. I was referring to a situation where you could not get ANY version of OpenGL but core 3.2+. SO will be flooded with people trying to get immediate mode tutorials to work if/when this happens.Dhow
@AndonM.Coleman: I sincerely hope not. We got enough trouble on the official GL forums with legacy stuff as it is.Prison
20

VAOs

  • For sharing larger data sets, a dedicated buffer containing a single vertex (attrib) array is certainly a reasonable way to go. You can still interleave other arrays in a second buffer and combine both using a VAO.

  • A VAO handles the binding of all those buffers and the vertex (attrib) array states, such as array buffer bindings and attrib entries with (buffer) pointers and enable/disable flags. Aside from its convenience, it is designed to do this job quickly: a single API call changes all states at once, without the tedious enabling and disabling of attrib arrays. It basically does what we had to do manually before. However, compared with my own VAO-like implementation, I could not measure any performance difference, even when doing lots of binds. From my point of view, the major advantage is its convenience.

So, a VAO doesn't decide on drawing performance in terms of glDraw*, but it can have an impact on the overhead of state changes.
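To illustrate what such a state container records, here is a minimal CPU-side sketch of a "VAO-like" structure in C. It does not call OpenGL; every name in it (`VertexArrayState`, `attrib_pointer`, `make_mixed_layout`, the buffer handles) is hypothetical. It only models the idea that each attribute remembers the buffer that was bound when its pointer was set, so a dedicated shared position buffer and an interleaved buffer can be combined in one object.

```c
#include <stddef.h>

enum { MAX_ATTRIBS = 16 };  /* assumed limit, for illustration only */

/* Per-attribute state, loosely modelled on what a VAO keeps. */
typedef struct {
    int      enabled;     /* enable/disable flag */
    unsigned buffer;      /* buffer bound when the pointer was set */
    int      components;  /* number of components per vertex */
    size_t   stride;      /* bytes between consecutive vertices */
    size_t   offset;      /* byte offset of the first element */
} AttribState;

typedef struct {
    AttribState attribs[MAX_ATTRIBS];
    unsigned    element_buffer;
} VertexArrayState;

/* Models the global array buffer binding, which is not VAO state itself. */
static unsigned bound_array_buffer;

static void bind_array_buffer(unsigned buffer)
{
    bound_array_buffer = buffer;
}

/* Records the pointer setup and captures the currently bound buffer. */
static void attrib_pointer(VertexArrayState *vao, unsigned index,
                           int components, size_t stride, size_t offset)
{
    if (index >= MAX_ATTRIBS)
        return;
    AttribState *a = &vao->attribs[index];
    a->buffer = bound_array_buffer;
    a->components = components;
    a->stride = stride;
    a->offset = offset;
}

static void enable_attrib(VertexArrayState *vao, unsigned index)
{
    if (index < MAX_ATTRIBS)
        vao->attribs[index].enabled = 1;
}

/* Positions come from a shared, tightly packed buffer; normals and
 * texture coordinates are interleaved in a second buffer. */
static VertexArrayState make_mixed_layout(unsigned shared_positions,
                                          unsigned interleaved)
{
    VertexArrayState vao = {0};
    const size_t stride = 5 * sizeof(float);  /* 3 normal + 2 texcoord */

    bind_array_buffer(shared_positions);
    attrib_pointer(&vao, 0, 3, 3 * sizeof(float), 0);

    bind_array_buffer(interleaved);
    attrib_pointer(&vao, 1, 3, stride, 0);
    attrib_pointer(&vao, 2, 2, stride, 3 * sizeof(float));

    enable_attrib(&vao, 0);
    enable_attrib(&vao, 1);
    enable_attrib(&vao, 2);
    return vao;
}
```

Calling `make_mixed_layout(7, 9)` with two made-up buffer handles yields a state object whose attribute 0 refers to buffer 7 and whose attributes 1 and 2 refer to buffer 9. Switching to another such object replaces all of that state in one step, which is the convenience described above.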

Interleaved data formats...

  • ...cause less GPU cache pressure, because the vertex coordinate and attributes of a single vertex aren't scattered all over in memory. They fit consecutively into few cache lines, whereas scattered attributes could cause more cache updates and therefore evictions. The worst case scenario could be one (attribute) element per cache line at a time because of distant memory locations, while vertices get pulled in a non-deterministic/non-contiguous manner, where possibly no prediction and prefetching kicks in. GPUs are very similar to CPUs in this matter.

  • ...are also very useful for various external formats that match the deprecated interleaved formats, because datasets from compatible data sources can be read straight into mapped GPU memory. I ended up re-implementing these interleaved formats with the current API for exactly those reasons.

  • ...should be laid out in an alignment-friendly way, just like simple arrays. Mixing data types with different size/alignment requirements may need padding to be GPU and CPU friendly. This is the only downside I know of, apart from the more difficult implementation.

  • ...do not prevent you from pointing to single attrib arrays in them for sharing.

Interleaving will most probably improve draw performance.
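To make the cache argument concrete, here is a rough C sketch that counts how many cache lines a single vertex fetch touches in each layout. The `Vertex` struct, the 64-byte line size and the helper names are assumptions for illustration; real GPU vertex caches differ, so treat this as a back-of-the-envelope model and not as a measurement.

```c
#include <stddef.h>

/* Hypothetical vertex format: position, normal, texture coordinate. */
typedef struct {
    float position[3];
    float normal[3];
    float texcoord[2];
} Vertex;

enum { CACHE_LINE = 64 };  /* assumed cache line size in bytes */

/* Number of cache lines covered by the byte range [begin, begin + size). */
static size_t lines_spanned(size_t begin, size_t size)
{
    size_t first = begin / CACHE_LINE;
    size_t last = (begin + size - 1) / CACHE_LINE;
    return last - first + 1;
}

/* Lines touched when vertex i is read from one interleaved buffer
 * that starts on a cache line boundary. */
static size_t lines_interleaved(size_t i)
{
    return lines_spanned(i * sizeof(Vertex), sizeof(Vertex));
}

/* Lines touched when vertex i is read from three separate, tightly
 * packed arrays. Each array is assumed to live in its own buffer,
 * starting on a cache line boundary, so the arrays never share a line. */
static size_t lines_separate(size_t i)
{
    const size_t vec3 = 3 * sizeof(float);
    const size_t vec2 = 2 * sizeof(float);
    return lines_spanned(i * vec3, vec3)   /* positions */
         + lines_spanned(i * vec3, vec3)   /* normals */
         + lines_spanned(i * vec2, vec2);  /* texture coordinates */
}
```

With 4-byte floats the interleaved vertex is 32 bytes, so under these assumptions every vertex fits in one line, while the separate layout touches at least three lines per vertex. Whether that difference shows up in practice depends on the access pattern and the hardware, so it still needs profiling.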

Conclusion:

In my experience, it is best to have cleanly designed interfaces for vertex data sources and 'compiled' VAOs, with the VAO factory encapsulated appropriately. This factory can then be altered to initialize interleaved, separate or mixed vertex buffer layouts from data sources without breaking anything. This is especially useful for profiling.

After all that babbling, my advice is simple: Proper and sufficiently abstracted design before and for optimization.

Whist answered 17/9, 2013 at 17:32 Comment(1)
It's nice to see an SO answer that says, "get your architecture right, and things will work out ideally in the end". Focused as I am on architecture and APIs on the general application front, this answer opened my eyes to how things need to be tackled re my OpenGL code. Thanks!Insidious
3

A VAO doesn't hold any vertex attribute data. It is a container object for a set of vertex arrays, which describe how to pull data from zero, one or multiple buffer objects, for the enable states of those vertex arrays, and possibly for an ELEMENT_ARRAY_BUFFER_BINDING. The vertex arrays are what you define with VertexAttribPointer() (pre-GL43) or with VertexAttribFormat(), VertexAttribBinding() and BindVertexBuffer() (GL43+). See tables 23.3 and 23.4 of the GL 4.4 core specification for details.

The ARRAY_BUFFER_BINDING is recorded separately for each vertex array, i.e. for each VertexAttribPointer() invocation per attribute index. This way you can associate the attribute indices of a VAO with multiple buffer objects and choose which buffers to pull from, either using {Enable|Disable}VertexAttribArray() or by distributing buffers across attrib indices and choosing appropriate attrib locations for your shaders. The locations can be set with glBindAttribLocation() or with explicit attrib locations inside your shader (the latter is superior).

Why all this blabbering about VAOs? Because there is no detrimental effect of using VAOs, and the layout of a VBO's buffer store and how quickly vertices are pulled have nothing to do with VAOs. VAOs are state containers, nothing more, nothing less. You still need buffer storage to back any vertex pulling, and you can interleave your data just like you did without VAOs. All you need to do is reflect the interleaved memory layout in your vertex arrays. So in essence, except for recording vertex array state, nothing changes.
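As a small illustration of "reflecting the interleaved memory layout", the following C sketch derives the per-attribute stride and offset from a vertex struct. These are the values that would go into the `stride` and `pointer` arguments of glVertexAttribPointer(). The `PackedVertex` and `AttribLayout` types, the attribute indices and `describe_layout` are made up for this example, and no GL call is made here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical interleaved vertex: all three attributes share one buffer. */
typedef struct {
    float   position[3];
    float   normal[3];
    uint8_t color[4];   /* RGBA, one byte per channel */
} PackedVertex;

/* What one vertex array needs to know about the buffer layout. */
typedef struct {
    unsigned index;       /* attribute index (assumed shader location) */
    int      components;  /* components per vertex */
    size_t   stride;      /* byte distance between consecutive vertices */
    size_t   offset;      /* byte offset of the attribute in a vertex */
} AttribLayout;

enum { ATTRIB_COUNT = 3 };

/* Fills in one descriptor per attribute. With interleaved data every
 * attribute uses the same stride and differs only in its offset. */
static void describe_layout(AttribLayout out[ATTRIB_COUNT])
{
    const size_t stride = sizeof(PackedVertex);

    out[0] = (AttribLayout){ 0, 3, stride, offsetof(PackedVertex, position) };
    out[1] = (AttribLayout){ 1, 3, stride, offsetof(PackedVertex, normal) };
    out[2] = (AttribLayout){ 2, 4, stride, offsetof(PackedVertex, color) };
}
```

Using sizeof and offsetof instead of hand-written numbers keeps the descriptors correct if the compiler inserts padding or the struct changes later. The same descriptors work with or without a VAO; the VAO merely remembers them.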

What you gain by using VAOs is a way to more or less quickly switch between sets of state and associated buffer objects without setting up vertex arrays every time you switch a buffer object. You therefore save API calls. Also, when binding a VAO, each vertex array still has its ARRAY_BUFFER_BINDING, so there is no need to call BindBuffer() again, which saves further API calls. That's it.

You neither gain nor lose anything in terms of vertex pulling performance because of a VAO, at least not in theory. You do, however, lose overall performance when you inconsiderately switch VAOs around like crazy.

BTW, using VAOs is also mandatory when using GL32 and higher core contexts so your question is moot if you're not going for compat.

In general, when you're unsure about performance: don't guess, always profile! That's especially true when using OpenGL.

Prison answered 17/9, 2013 at 16:6 Comment(0)
