Unexpected sizes of arrays in a HLSL Constant Buffer

I have not used constant buffers as complicated as this one before but, from what I understand, my C++ alignment and packing have to match what HLSL expects. So I'm trying to work out the rules so that I can predictably lay out the C++ struct to match.

I ran some tests in a Vertex Shader v5 to see the packing the compiler produces, using this structure in the vs.hlsl:

cbuffer conbuf {
    float m0;
    float m1;
    float4 m2;
    bool m3[1];
    bool m4[4];
    float4 m5;
    float m6;
    float4 m7;
    matrix m8;
    float m9;
    float m10;
    float4 m11[2];
    float m12[8];
    float m13;
};

which produced the following output (in the header generated through the Header File Name option in the VC++ project's HLSL settings):

cbuffer conbuf {
    float m0; // Offset: 0 Size: 4
    float m1; // Offset: 4 Size: 4
    float4 m2; // Offset: 16 Size: 16
    bool m3; // Offset: 32 Size: 4
    bool m4[4]; // Offset: 48 Size: 52
    float4 m5; // Offset: 112 Size: 16
    float m6; // Offset: 128 Size: 4
    float4 m7; // Offset: 144 Size: 16
    float4x4 m8; // Offset: 160 Size: 64
    float m9; // Offset: 224 Size: 4
    float m10; // Offset: 228 Size: 4
    float4 m11[2]; // Offset: 240 Size: 32
    float m12[8]; // Offset: 272 Size: 116
    float m13; // Offset: 388 Size: 4
};

I have pretty much figured out how the offsets work (based on the sizes), but I cannot understand the array sizes.

Some array sizes here seem random. I can't figure out how the bool m4[4] array ends up with Size: 52. The same goes for float m12[8], which has Size: 116. How does the HLSL compiler arrive at these sizes?

Any help? I've already looked at the MSDN packing rules page, but it doesn't say much about arrays.

Flagwaving answered 18/6, 2014 at 3:3 Comment(7)
How can we comment on the weird "size" (or for that matter offset) values you're showing when you haven't shown us the code that generated them? - Annice
@TonyD The cbuffer in the .hlsl file is first. The one output in the Header File Name property of VC++ is the second. The VS does nothing: return pos * m1 * m10 * m4[2]; so I use a couple of values for the cbuffer not to be optimized away. - Flagwaving
Padding problems?!? All of your size/offset assumptions may turn out completely wrong :P ... - Lammers
@πάνταῥεῖ Don't know. The weird sized arrays start on proper offsets. So I assume padding is not included in their size, as the other more simple members don't have size affected by padding. If they turn out wrong, how can one predictably make the C++ struct to match HLSL? - Flagwaving
@Flagwaving Of course something like bool m4[4]; will force getting float4 m5; padded to appear at the next suitable 32 or 64 bit plain address. Forget about your offset assumptions, derive these from concrete sizeof queries! Here may be some hints how to determine the 'real' offsets: #18252315 - Lammers
@πάνταῥεῖ My question is about m4. I understand why m5 is where it is. As it's padded to the next 16-byte alignment after m4. So m5 is at m4.offset + m4.size + padding to(16). m4's size is what puzzles me. - Flagwaving
Take care of byte alignment. It could be possible that he tries to read float4 values in one step. So, the first float4 has an offset about sizeof(float4). Try to realign all values starting with float4x4 to float4 to float to bool. Try to put 16 bytes in a block. - Spinal

I'll simplify your example a little, since you already understand padding.

One important rule for arrays, as per the packing rules (the page you mentioned), is:

Arrays are not packed in HLSL by default. To avoid forcing the shader to take on ALU overhead for offset computations, every element in an array is stored in a four-component vector.

So let's take this simple cbuffer:

cbuffer cbPerObj : register( b0 )
{
     float Alpha[4];
};

As per the above rule (each float is stored in its own four-component vector), this would be (almost) equivalent to:

cbuffer cbPerObj : register( b0 )
{
     float4 Alpha[4];
};

Or (expanded):

cbuffer cbPerObj : register( b0 )
{
     float Alpha1;
     float3 Dummy1;
     float Alpha2;
     float3 Dummy2;
     float Alpha3;
     float3 Dummy3;
     float Alpha4;
};
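
On the C++ side, the expanded layout above could be mirrored roughly as follows. This is only a sketch: `PaddedFloat` and `CbPerObj` are names made up for illustration, and it assumes a 4-byte `float`. The last element is padded here as well, which makes the struct 64 bytes rather than the 52 the compiler reports; that is convenient because Direct3D 11 requires constant buffer sizes to be multiples of 16 bytes.

```cpp
#include <cstddef>  // offsetof

// Hypothetical CPU-side mirror of `float Alpha[4]` inside a cbuffer.
// Each array element sits in the .x component of its own 16-byte register.
struct PaddedFloat {
    float value;   // the actual array element
    float pad[3];  // unused .yzw components of the register
};

struct CbPerObj {
    PaddedFloat Alpha[4];  // elements at byte offsets 0, 16, 32 and 48
};

static_assert(sizeof(float) == 4, "sketch assumes a 4-byte float");
static_assert(sizeof(PaddedFloat) == 16, "one element per 16-byte register");
static_assert(offsetof(CbPerObj, Alpha) == 0, "array starts the buffer");
static_assert(sizeof(CbPerObj) == 64, "4 registers, trailing padding included");
```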

Notice that the last element is not padded. This is why, in your case, you see:

bool m4[4]; // Offset: 48 Size: 52
float4 m5; // Offset: 112 Size: 16

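m4 would be 16*4 = 64 bytes, minus the 3 padding components (12 bytes) that are not added after the last element: 64 - 12 = 52.

You can also see that 48 + 52 = 100. Since m5 is a float4 and must not cross a 16-byte boundary, it starts at the next boundary, 112, and that is where the 12 "lost" bytes show up as offset padding.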
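
The arithmetic above can be written down as a small helper. This is a sketch rather than anything taken from the compiler: `hlslArrayReportedSize` and `alignUp` are hypothetical names, and the formula only covers arrays whose elements fit in a single 16-byte register (scalars and vectors, not matrices or structs).

```cpp
#include <cstddef>

constexpr std::size_t kRegisterBytes = 16;  // one float4 register

// Size reported for an array of `count` elements of `elemBytes` each:
// every element but the last takes a full register, the last one takes
// only its own size.
constexpr std::size_t hlslArrayReportedSize(std::size_t count,
                                            std::size_t elemBytes) {
    return count == 0 ? 0 : (count - 1) * kRegisterBytes + elemBytes;
}

// Round `offset` up to the next multiple of `alignment`.
constexpr std::size_t alignUp(std::size_t offset, std::size_t alignment) {
    return (offset + alignment - 1) / alignment * alignment;
}

// Values from the compiler output in the question.
static_assert(hlslArrayReportedSize(1, 4) == 4, "bool m3[1]");
static_assert(hlslArrayReportedSize(4, 4) == 52, "bool m4[4]");
static_assert(hlslArrayReportedSize(8, 4) == 116, "float m12[8]");
static_assert(hlslArrayReportedSize(2, 16) == 32, "float4 m11[2]");
static_assert(alignUp(48 + 52, kRegisterBytes) == 112, "offset of float4 m5");
```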

If you had this instead:

bool m4[4]; // Offset: 48 Size: 52
float m5;

the offset for m5 would be 100, since a single float fits in the remaining 12 bytes of the last register without crossing a boundary.
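
Putting this together, the cbuffer from the question might be mirrored in C++ along these lines. Treat it as a sketch under assumptions: all type and padding names (`RegFloat`, `RegBool`, `ConBuf`, `pad0` and so on, `m12_head`, `m12_last`) are invented for illustration, HLSL `bool` is taken to be 4 bytes as the compiler output shows, and the matrix is kept as 16 raw floats without addressing element order. Note how `m12` is split so that `m13` can sit right after its last element at offset 388, exactly as in the compiler output.

```cpp
#include <cstddef>  // offsetof
#include <cstdint>  // std::int32_t

// Hypothetical helpers: one array element per 16-byte register.
struct RegFloat { float value; float pad[3]; };
struct RegBool  { std::int32_t value; std::int32_t pad[3]; };

struct ConBuf {
    float        m0;            // offset 0
    float        m1;            // offset 4
    float        pad0[2];       // float4 m2 must not straddle a register
    float        m2[4];         // offset 16
    std::int32_t m3;            // offset 32, bool m3[1]
    float        pad1[3];       // arrays start on a new register
    RegBool      m4[4];         // offset 48, trailing padding made explicit
    float        m5[4];         // offset 112
    float        m6;            // offset 128
    float        pad2[3];
    float        m7[4];         // offset 144
    float        m8[16];        // offset 160, float4x4
    float        m9;            // offset 224
    float        m10;           // offset 228
    float        pad3[2];
    float        m11[2][4];     // offset 240, float4 m11[2]
    RegFloat     m12_head[7];   // offset 272, first 7 elements of m12
    float        m12_last;      // offset 384, last element is not padded
    float        m13;           // offset 388, packed into the same register
    float        pad4[2];       // round the total up to a multiple of 16
};

static_assert(offsetof(ConBuf, m4) == 48, "m4 offset");
static_assert(offsetof(ConBuf, m5) == 112, "m5 offset");
static_assert(offsetof(ConBuf, m12_head) == 272, "m12 offset");
static_assert(offsetof(ConBuf, m13) == 388, "m13 offset");
static_assert(sizeof(ConBuf) == 400, "total size, multiple of 16");
```

Keeping the `static_assert` checks next to the struct means a mismatch with the compiler-reported offsets fails at build time instead of showing up as wrong values in the shader.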

Hope that makes sense.

Impenitent answered 19/6, 2014 at 16:38 Comment(1)
This answer should go on the MSDN HLSL Packing page. It makes sense compared to what they wrote there... I figured how I can better pack them but I was really curious about those sizes. It's the lack of end padding. THANK YOU! - Flagwaving
