Win32 Wide-Character String Alignment Requirements

I narrowed down a problem in my GUI code to SetWindowTextW(HWND, wchar_t *) silently failing if the new window title is not aligned to two bytes. In this case, SetWindowText() returns 1 (success) but does not set the new text.

The natural alignment of wchar_t on MSVC is 2 bytes, so this was definitely my error. But just to be sure I tried to find the alignment rules for Win32 strings.

I found no official documentation, just an old newsgroup thread mentioning a bug report for the Open Watcom compiler – which claims that Win32 and COM on Windows NT actually require 4-byte alignment! While this seemed outlandish to me, I noticed that MSVC does indeed align every wchar_t literal to four bytes, not two. You can actually make MSVC pack wchar_t constant strings more densely via alignas(2). Heap granularity in Win32 is also >=8 bytes.

If Win32 required four-byte alignment for wide-character strings (as the source claims) and API calls silently failed on misaligned data (as SetWindowText() does with 1-byte alignment), I would be in deep trouble.

Is there any official documentation stating the definitive alignment requirements for wide-character strings in Win32/COM? Is it two or four bytes?


Code that reproduces the problem:

#include <Windows.h>
#include <CommCtrl.h>
#include <cassert>
#include <cstring> // memcpy

int main() {

  auto hWnd = CreateWindowW(WC_BUTTONW, L"original title", WS_OVERLAPPEDWINDOW, CW_USEDEFAULT, 0, CW_USEDEFAULT, 0, nullptr, nullptr, nullptr, nullptr);
  assert(hWnd);
  ShowWindow(hWnd, SW_SHOWDEFAULT);
  UpdateWindow(hWnd);

  // Set title (aligned string):
  auto alignedResult = SetWindowTextW(hWnd, L"aligned title");
  assert(alignedResult != 0);

  // Set title (unaligned string):
  char buffer[50];
  memcpy(&buffer[1], L"unaligned title", sizeof L"unaligned title");
  auto unalignedResult = SetWindowTextW(hWnd, reinterpret_cast<wchar_t*>(&buffer[1])); // undefined behavior but for simplicity
  assert(unalignedResult != 0); // success is reported but title didn’t change

  MSG msg;
  while (GetMessage(&msg, nullptr, 0, 0)) {
    TranslateMessage(&msg);
    DispatchMessage(&msg);
  }

  return 0;
}
Saury answered 5/2 at 21:9 Comment(20)
The Win32 ABI cannot impose stricter requirements on the alignment of data than a C compiler guarantees, which is apparently 2 on MSVC.Tearful
It's not a general Win32 requirement; it seems to be specific to this API (its implementation is complex if you disassemble it) and maybe others. Same behavior with SendMessage(WM_SETTEXT), by the way. Maybe it used to work with old Windows versions. It works fine with DrawTextW, for example, or with SendOrPostMessage(MyMessage, ...(LPARAM)myString), which is received OK. It also works fine with StringCchCopy, lstrcpy, wcscpy, etc. That would deserve a note in the official documentation IMHO.Mayest
The newsgroup entry seems to be specific to COM/OLE. COM/OLE use BSTRs which are slightly different: They still encode strings using wchar_t characters but have a 32-bit length prefix, meaning that the character string gets 4-byte aligned by coincidence.Tearful
@YangXiaoPo-MSFT this snippet reproduces the issue: pastebin.com/YyvV3jGf Window title changes from “original title” to “aligned title”, but not to “unaligned title”. SetWindowText() still returns success, though. Tested on fully updated Windows Server 2019Saury
@Saury about your code snippet: it should be WC_BUTTONW instead of WC_BUTTONEntwistle
Interestingly, if compiled as a 32-bit binary, it works even if the wide string is unaligned.Entwistle
It does work for x86 but not for x64. SetWindowTextW cannot explain this. I have submitted feedback at aka.ms/AAp0q3gDavison
It was resolved to a bug.Davison
@Davison Thanks a million! Any way for me to track progress? Feedback hub says Your account doesn't have access to this feedbackSaury
Unless otherwise specified, the Windows ABI requires that all data types are naturally aligned for their type. Some processors crash on misaligned data.Haar
It is internal. It has now been answered with "Not Repro", but I don't know why it cannot be reproduced, while, as @RaymondChen said, there must be alignment. Isn't reinterpret_cast<wchar_t*>(&buffer[1]) aligned to 2? Why does it work on x86 but not on x64?Davison
@RaymondChen we all assume this, but out of confusion over the behavior of x64 SetWindowText() I was asking if the alignment was officially documented somewhere. Someone mentioned "Using the Windows Headers" in a deleted answer, but there was surprisingly large confusion over the alignment of characters in structures vs. the alignment of strings on the stack. Hence I only answered after finding the CHARFORMATW precedentSaury
@Davison Because x64 uses xmm registers which require aligned data.Haar
@Saury the existence of the UNALIGNED macro implies that alignment is assumed by default. It is so obvious that nobody thought to write it down. Like "where did it say that when I pass a pointer, it has to be a valid pointer?"Haar
@RaymondChen Sorry but I don’t get the xmm hint. Doesn’t it usually require 16-B alignment, not 2-B? Which xmm instruction would succeed only with a source address aligned to 2 B?! (Yeah we’re in undefined behavior and all bets are off but your comment is oddly specific.) Or does the x64 version test for alignment and on error returns success anyway just because it can?Saury
Bulk string copies are done with xmm registers. Since wchar_t cannot start at odd addresses, there are only 8 possible starting points. If you pass an odd address, the code doesn't handle that case and weird things happen.Haar
It could be specific to this API as @SimonMourier said. Also works fine with MessageBoxW.Davison
@RaymondChen Thank you so much! I fixed my alignment. Another place where this happens, in case it ever helps someone, is the Windows HID API.Saury
@Davison More precisely it "seems to work fine" with MessageBoxW. But you are operating out of spec, and it is not guaranteed to work.Haar
FWIW, I changed my answer from “it’s a bug” to “it’s me hitting unspecified behavior”. I didn’t have the time to dive into the UNALIGNED macro yet, so feel free to further edit my answer.Saury

Win32 Wide Character string alignment is two bytes.

The article Using the Windows Headers contains a section Controlling Structure Packing that interlocks Win32 type alignment with that of a C compiler. This boils down to two-byte alignment for wide-character strings.

The discussion linked in the question does not apply because it is about BSTRs. These have a different memory layout from wide-character strings.
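To illustrate why a BSTR ends up four-byte aligned, here is a rough sketch of its memory layout. MockBstrBlock is a hypothetical stand-in, not a real Windows type, and it uses char16_t instead of wchar_t so the element size is two bytes on any compiler. The point is only that a 32-bit length prefix pushes the character data to offset 4.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the memory block behind a BSTR (not a Windows type).
struct MockBstrBlock {
    std::uint32_t byteLength;  // 32-bit length prefix, in bytes
    char16_t      chars[8];    // character data; a BSTR pointer points here
};

// The prefix forces the block to 4-byte alignment and puts the characters
// at offset 4, so the string pointer is 4-byte aligned as a side effect.
static_assert(alignof(MockBstrBlock) == 4, "prefix makes the block 4-byte aligned");
static_assert(offsetof(MockBstrBlock, chars) == 4, "characters start right after the prefix");
```

A plain wide-character string has no such prefix, so nothing pushes it beyond the two-byte alignment of its element type.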

The SetWindowTextW() problem is me hitting unspecified behavior because I violated a fundamental API rule by passing unaligned data.
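For anyone who receives wide-character data at an arbitrary byte offset (a packed buffer, a device report, etc.), one way to stay within spec is to copy it into properly aligned storage before calling the API. A minimal sketch, assuming the data is NUL-terminated; is_wchar_aligned and copy_wide_from_bytes are names I made up for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// True if p satisfies the natural alignment of wchar_t (2 bytes on MSVC).
inline bool is_wchar_aligned(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(wchar_t) == 0;
}

// Hypothetical helper: reads a NUL-terminated wide string that may start at
// any byte address and returns it in a std::wstring, whose buffer is
// suitably aligned for wchar_t. Reads at most max_bytes bytes.
inline std::wstring copy_wide_from_bytes(const void* raw, std::size_t max_bytes) {
    assert(raw != nullptr);
    const auto* bytes = static_cast<const unsigned char*>(raw);
    std::wstring out;
    for (std::size_t i = 0; i + sizeof(wchar_t) <= max_bytes; i += sizeof(wchar_t)) {
        wchar_t ch;
        std::memcpy(&ch, bytes + i, sizeof ch);  // memcpy has no alignment requirement
        if (ch == L'\0') break;
        out.push_back(ch);
    }
    return out;
}
```

In the repro from the question this would become SetWindowTextW(hWnd, copy_wide_from_bytes(&buffer[1], sizeof buffer - 1).c_str()), which hands the API an aligned pointer instead of the reinterpret_cast one.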

Aside from documentation, you can find places in the Windows headers that set the precedent. The CHARFORMATW structure, for example, is four-byte aligned but contains a two-byte aligned wide-character string.
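The CHARFORMATW precedent can be checked with offsetof. The sketch below mirrors the field list as I recall it from richedit.h, using fixed-width types so it compiles without the Windows headers; MockCharFormatW is a hypothetical name, and the field order is an assumption you should verify against your SDK.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of CHARFORMATW (field order assumed from richedit.h).
struct MockCharFormatW {
    std::uint32_t cbSize;
    std::uint32_t dwMask;
    std::uint32_t dwEffects;
    std::int32_t  yHeight;
    std::int32_t  yOffset;
    std::uint32_t crTextColor;
    std::uint8_t  bCharSet;
    std::uint8_t  bPitchAndFamily;
    char16_t      szFaceName[32];  // LF_FACESIZE wide characters
};

// The string member lands on a 2-byte boundary that is not a 4-byte boundary.
static_assert(offsetof(MockCharFormatW, szFaceName) % 2 == 0, "two-byte aligned");
static_assert(offsetof(MockCharFormatW, szFaceName) % 4 != 0, "not four-byte aligned");
```

With the real headers, the equivalent check would be on offsetof(CHARFORMATW, szFaceName). If wide-character strings needed four-byte alignment, this structure could not be laid out this way.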

Saury answered 26/2 at 20:45 Comment(0)
