Uploading texture very slow in OpenGL
I have written an emulator which I am in the process of porting to Linux. At the moment I am using Direct3D 11 for the video, which I am porting to OpenGL (running on Windows for now). I render to a 1024x1024 texture, which I upload every frame (the original hardware doesn't really lend itself to modern hardware acceleration, so I just do it all in software). However, I have found that uploading the texture in OpenGL is a lot slower.

In Direct3D uploading the texture every frame drops the frame rate from 416 to 395 (a 5% drop). In OpenGL it drops from 427 to 297 (a 30% drop!).

Here's the relevant code from my draw function.

Direct3D:

D3D11_MAPPED_SUBRESOURCE resource;
deviceContext_->Map(texture, 0, D3D11_MAP_WRITE_DISCARD, 0, &resource);
uint32_t *buf = reinterpret_cast<uint32_t *>(resource.pData);
memcpy(buf, ...);
deviceContext_->Unmap(texture, 0);

OpenGL:

glBindTexture(GL_TEXTURE_2D, texture);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, 1024, 1024, 0, GL_RGBA,
    GL_UNSIGNED_BYTE, textureBuffer);

Can anyone suggest what may be causing this slowdown?

If it makes any odds, I'm running Windows 7 x64 with an NVIDIA GeForce GTX 550 Ti.

Tobacco asked 16/10, 2012 at 20:26

glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, 1024, 1024, 0, GL_RGBA, GL_UNSIGNED_BYTE, textureBuffer);

You're doing several things wrong here. First, glTexImage2D allocates new storage for the texture, so calling it every frame is the equivalent of creating a Direct3D texture resource every frame. Your D3D code doesn't create the texture each frame; it just uploads to it. You should call glTexImage2D only once per mipmap level of interest; after that, all uploading should happen with glTexSubImage2D.

Second, your internal format (third parameter from the left) is GL_RGBA. You should always use explicit sizes for your image formats. So use GL_RGBA8. This isn't really a problem, but you should get into the habit now.
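
As a rough sketch of these first two points, assuming the 1024x1024 texture from the question and a GL context that is already current: allocate the storage once with a sized internal format, then overwrite it every frame. The names createFrameTexture, uploadFrame and kSize are made up for illustration.

```cpp
#include <GL/gl.h>   // on Windows, include <windows.h> before this header
#include <cstdint>

constexpr GLsizei kSize = 1024;  // texture width and height, as in the question

// Call once at start-up: creates the texture and allocates its storage.
GLuint createFrameTexture() {
    GLuint texture = 0;
    glGenTextures(1, &texture);
    glBindTexture(GL_TEXTURE_2D, texture);
    // Only level 0 is allocated, so use non-mipmapped filters; the default
    // min filter expects a full mipmap chain.
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    // Sized internal format (GL_RGBA8); a null pointer allocates without uploading.
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, kSize, kSize, 0,
                 GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
    return texture;
}

// Call every frame: replaces the contents of the existing storage.
void uploadFrame(GLuint texture, const std::uint32_t* pixels) {
    glBindTexture(GL_TEXTURE_2D, texture);
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, kSize, kSize,
                    GL_RGBA, GL_UNSIGNED_BYTE, pixels);
}
```

In the draw function from the question, the glTexImage2D call would become uploadFrame(texture, textureBuffer).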

Third, you're using GL_RGBA ordering for your pixel transfer format (the third parameter from the right, not the left). This is generally not the fastest pixel transfer format, as a lot of hardware prefers GL_BGRA ordering. But if whatever produces your data can't produce it in that order, there's not much that can be done.
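
Since the pixels here are produced in software, one option is to write them in BGRA order to begin with. A minimal sketch, assuming 32-bit pixels; packBGRA is a hypothetical helper, not something from the question. A pixel packed this way matches the transfer format GL_BGRA with type GL_UNSIGNED_INT_8_8_8_8_REV (on a little-endian machine the bytes in memory are B, G, R, A, so GL_BGRA with GL_UNSIGNED_BYTE reads it the same way).

```cpp
#include <cstdint>

// Hypothetical helper: packs one pixel as 0xAARRGGBB.
// Blue sits in the lowest 8 bits and alpha in the highest, which is the
// layout GL_BGRA + GL_UNSIGNED_INT_8_8_8_8_REV expects.
constexpr std::uint32_t packBGRA(std::uint8_t r, std::uint8_t g,
                                 std::uint8_t b, std::uint8_t a = 0xFF) {
    return (std::uint32_t{a} << 24) | (std::uint32_t{r} << 16) |
           (std::uint32_t{g} << 8)  |  std::uint32_t{b};
}
```

Whether this is actually faster depends on the driver, so it is worth timing both orderings.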

Fourth, if you have something else you can do between starting the upload and actually rendering with it, you can employ asynchronous pixel transfer operations. You write your data to a buffer object (which can be mapped, so that you don't have to copy into it). Then, with that buffer bound to GL_PIXEL_UNPACK_BUFFER, you use glTexSubImage2D to transfer this data to the texture. Because both the source data and the destination image live in memory that OpenGL manages, it doesn't have to copy the data out of client memory before glTexSubImage2D returns.

Granted, that's probably not going to help you much, since you're already effectively doing that copy in the D3D case.
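
For reference, the buffer-object path from the fourth point could look roughly like this. It is a sketch under several assumptions: a current GL 2.1 context, GLEW as the extension loader, and a texture already allocated as 1024x1024 GL_RGBA8. The function names and kSize/kBytes are made up for illustration. Re-specifying the buffer with a null pointer before mapping plays a role similar to D3D11_MAP_WRITE_DISCARD: the driver can hand back fresh memory instead of waiting for the previous upload to finish.

```cpp
#include <GL/glew.h>  // assumption: GLEW provides the buffer-object entry points
#include <cstdint>

constexpr GLsizei kSize = 1024;
constexpr GLsizeiptr kBytes = GLsizeiptr(kSize) * kSize * 4;  // 4 bytes per pixel

// Call once: creates a pixel unpack buffer big enough for one frame.
GLuint createUploadBuffer() {
    GLuint pbo = 0;
    glGenBuffers(1, &pbo);
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
    glBufferData(GL_PIXEL_UNPACK_BUFFER, kBytes, nullptr, GL_STREAM_DRAW);
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);
    return pbo;
}

// Start of frame: returns a write-only pointer to render into (may be null).
// Avoid reading from it; mapped memory can be very slow to read.
std::uint32_t* beginFrame(GLuint pbo) {
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
    // Orphan the old storage so the map does not wait on a pending transfer.
    glBufferData(GL_PIXEL_UNPACK_BUFFER, kBytes, nullptr, GL_STREAM_DRAW);
    void* ptr = glMapBuffer(GL_PIXEL_UNPACK_BUFFER, GL_WRITE_ONLY);
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);
    return static_cast<std::uint32_t*>(ptr);
}

// End of frame: unmaps the buffer and starts the transfer into the texture.
void endFrame(GLuint pbo, GLuint texture) {
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
    // glUnmapBuffer returns GL_FALSE if the contents were lost; skip the upload then.
    if (glUnmapBuffer(GL_PIXEL_UNPACK_BUFFER) == GL_TRUE) {
        glBindTexture(GL_TEXTURE_2D, texture);
        // With an unpack buffer bound, the last argument is a byte offset
        // into that buffer rather than a client-memory pointer.
        glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, kSize, kSize,
                        GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
    }
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);
}
```

The software renderer would write its frame straight into the pointer from beginFrame, then call endFrame before drawing with the texture.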

In OpenGL it drops from 427 to 297 (a 30% drop!)

The more important statistic is that it's a 1 millisecond difference. You should look at your timings in absolute time, not in frames-per-second, nor in percentage drops of FPS.

Grad answered 16/10, 2012 at 20:53 Comment(3)
Thanks for the very helpful answer! glTexSubImage2D helped massively; it's up to 373 fps (2.7 ms) now. The BGRA thing is interesting. I actually have to use BGRA to get it to show the right colour; I had assumed it was something I had screwed up and was something I was going to look into later, so I just changed the code I posted. Oddly, changing from RGBA to BGRA drops the frame rate to 328 (3.0 ms); why might that happen? - Tobacco
For what it's worth, rendering into a PBO and using it immediately, instead of rendering then copying, increased the framerate to 424. Need to do some more optimisation on the D3D renderer now to catch up! - Tobacco
@nichol-bolas "Fourth, if you have something else you can do between starting the upload and actually rendering with it" - Surely you can start rendering with it immediately? You don't need to do something else in the meantime. The driver will just process your rendering commands asynchronously, after the DMA transfer + pixel unpack operation has completed. - Universality

glTexImage2D reallocates the texture's memory as well as updating it. Try using glTexSubImage2D instead.

Mansell answered 16/10, 2012 at 20:40
