ToAscii/ToUnicode in a keyboard hook destroys dead keys

It seems that if you call ToAscii() or ToUnicode() while in a global WH_KEYBOARD_LL hook, and a dead-key is pressed, it will be 'destroyed'.

For example, say you've configured your input language in Windows as Spanish, and you want to type an accented letter á in a program. Normally, you'd press the single-quote key (the dead key), then the letter "a", and then on the screen an accented á would be displayed, as expected.

But this doesn't work if you call ToAscii() or ToUnicode() in a low-level keyboard hook function. It seems that the dead key is destroyed, and so no accented letter á shows up on screen. Removing a call to the above functions resolves the issue... but unfortunately, I need to be able to call those functions.

I Googled for a while, and while a lot of people seemed to have this issue, no good solution was provided.

Any help would be much appreciated!

EDIT: I'm calling ToAscii() to convert the virtual-key code and scan code received in my LowLevelKeyboardProc hook function into the resulting character that will be displayed on screen for the user.

I tried MapVirtualKey(kbHookData->vkCode, 2), but this isn't as "complete" a function as ToAscii(); for example, if you press Shift + 2, you'll get '2', not '@' (or whatever Shift + 2 will produce for the user's keyboard layout/language).

ToAscii() is perfect... until a dead-key is pressed.

EDIT2: Here's the hook function, with irrelevant info removed:

LRESULT CALLBACK keyboard_LL_hook_func(int code, WPARAM wParam, LPARAM lParam) {

    LPKBDLLHOOKSTRUCT kbHookData = (LPKBDLLHOOKSTRUCT)lParam;
    BYTE keyboard_state[256];

    if (code < 0) {
        return CallNextHookEx(keyHook, code, wParam, lParam);
    }

    WORD wCharacter = 0;

    GetKeyboardState(keyboard_state);
    int ta = ToAscii((UINT)kbHookData->vkCode, kbHookData->scanCode,
                     keyboard_state, &wCharacter, 0);

    /* If ta == -1, a dead-key was pressed. The dead-key will be "destroyed"
     * and you'll no longer be able to create any accented characters. Remove
     * the call to ToAscii() above, and you can then create accented characters. */

    return CallNextHookEx(keyHook, code, wParam, lParam);
}
Lichter asked 26/12, 2009 at 23:24 Comment(10)
Ablution: Please show the code for the call to ToUnicode(). Where do you get the lpKeyState parameter?
Lichter: @jdigital: Well, I'm personally calling ToAscii, but I've read it happens with ToUnicode, too. My call is basically ToAscii(kbHookData->vkCode, kbHookData->scanCode, GetKeyboardState(), &lpchar, 0) in the WH_KEYBOARD_LL hook function. When the ToAscii call returns -1, this means a dead key was pressed... and the dead key is 'destroyed' as I explained above.
Gaga: I deleted my answer; it was completely incorrect, based on the assumption that you were referring to C runtime functions to convert strings rather than Win32 APIs to convert scan codes... sorry.
Adjure: It would help a lot if you could describe the purpose of doing this; we may find other solutions for your problem.
Lichter: @Sorin: See the edit to my original question for more info. :-)
Filicide: Can you post your entire hook function code? The problem could hide not in ToAscii itself, but in the surrounding hook logic. Also, as a wild guess, try SetLastError(0) at the end of the hook function.
Lichter: @atzz: I just added the hook function code. I believe it is ToAscii() that's the problem, because if I don't call it, the problem doesn't occur. Thanks for the SetLastError(0) suggestion, but unfortunately that didn't do anything.
Ablution: You're missing the initialization for keyboard_state.
Lichter: @jdigital: Oops. Fixed. GetKeyboardState() was called in my real code, though.
Sewel: Did you find any solution?
4

It is known that ToUnicode() and its older counterpart ToAscii() can change the keyboard state of the current thread and thus interfere with dead keys and ALT+NUMPAD keystrokes:

As ToUnicodeEx translates the virtual-key code, it also changes the state of the kernel-mode keyboard buffer. This state-change affects dead keys, ligatures, alt+numpad key entry, and so on. It might also cause undesired side-effects if used in conjunction with TranslateMessage (which also changes the state of the kernel-mode keyboard buffer).

To avoid that, you can make your ToUnicode() call in a separate thread (it will have a separate keyboard state) or use a special flag in the wFlags parameter, documented in the ToUnicode() docs:

If bit 2 is set, keyboard state is not changed (Windows 10, version 1607 and newer)
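
As a rough illustration of the flag quoted above, here is a minimal C sketch. The names TOUNICODE_KEEP_STATE, classify_tounicode_result and peek_key_text are made up for this example and are not part of the Windows API; only ToUnicode() and KBDLLHOOKSTRUCT are real. The Windows-only helper is guarded so that the portable part still compiles elsewhere, and it assumes Windows 10 version 1607 or newer, where bit 2 is honored.

```c
#include <assert.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Bit 2 of wFlags: "keyboard state is not changed" (Windows 10 1607+).
 * The macro name is ours; the Windows headers do not define one. */
#define TOUNICODE_KEEP_STATE (1u << 2)

typedef enum { KEY_IS_DEAD, KEY_NO_TEXT, KEY_HAS_TEXT } key_text_kind;

/* Interpret ToUnicode()'s return value: negative means a dead key,
 * zero means no translation, positive is the number of UTF-16 units. */
static key_text_kind classify_tounicode_result(int ret) {
    if (ret < 0) return KEY_IS_DEAD;
    if (ret == 0) return KEY_NO_TEXT;
    return KEY_HAS_TEXT;
}

#ifdef _WIN32
/* Hypothetical helper for a WH_KEYBOARD_LL hook: translate the key
 * while asking Windows to leave the pending dead-key state alone. */
static int peek_key_text(const KBDLLHOOKSTRUCT *kb, const BYTE key_state[256],
                         WCHAR *out, int out_len) {
    assert(kb != NULL && out != NULL && out_len > 0);
    return ToUnicode(kb->vkCode, kb->scanCode, key_state, out, out_len,
                     TOUNICODE_KEEP_STATE);
}
#endif
```

On older Windows versions the bit is presumably ignored, so the dead key would still be consumed there.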

Or you can prepare a scan code -> character mapping table beforehand and update it on the WM_INPUTLANGCHANGE language-change event.
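
The table idea could be sketched roughly as follows. Everything here is an assumption made for illustration: the table size, the two shift levels, and the translate_fn callback, which on Windows might wrap ToUnicodeEx() and be re-run whenever WM_INPUTLANGCHANGE arrives. The demo translator only knows two keys of a US layout and stands in for the real call. Note that a flat table like this cannot express dead-key combinations by itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SC_COUNT 128u   /* assumption: plain (non-extended) scan codes 0..127 */
#define LEVEL_COUNT 2u  /* assumption: level 0 = no modifiers, level 1 = Shift */

/* Hypothetical translator: returns the UTF-16 code unit for a scan code
 * at a given shift level, or 0 if the key produces no character. */
typedef uint16_t (*translate_fn)(unsigned scan_code, unsigned level);

typedef struct {
    uint16_t ch[SC_COUNT][LEVEL_COUNT];
} sc_char_table;

/* Fill the whole table once, outside the hook, so the hook never has
 * to call a state-changing translation function itself. */
static void sc_table_rebuild(sc_char_table *t, translate_fn translate) {
    assert(t != NULL && translate != NULL);
    for (unsigned sc = 0; sc < SC_COUNT; sc++)
        for (unsigned level = 0; level < LEVEL_COUNT; level++)
            t->ch[sc][level] = translate(sc, level);
}

/* Pure lookup for use inside the hook; out-of-range input yields 0. */
static uint16_t sc_table_lookup(const sc_char_table *t, unsigned scan_code,
                                unsigned level) {
    if (scan_code >= SC_COUNT || level >= LEVEL_COUNT) return 0;
    return t->ch[scan_code][level];
}

/* Stand-in translator covering two US-layout keys, for demonstration only. */
static uint16_t demo_us_translate(unsigned scan_code, unsigned level) {
    if (scan_code == 0x03) return level ? '@' : '2';  /* the "2" key */
    if (scan_code == 0x1E) return level ? 'A' : 'a';  /* the "A" key */
    return 0;
}
```

With the demo translator, looking up scan code 0x03 at level 1 gives '@', which is the Shift + 2 case from the question.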

I think it should work with ToAscii() too, but it is better not to use this old, ANSI-code-page-dependent method. Use the ToUnicode() API instead; it can even return ligatures and UTF-16 surrogate pairs if the keyboard layout has them. Some do.

See Asynchronous input vs synchronous input, a quick introduction for the reason behind this.

Chorography answered 1/6, 2022 at 16:28 Comment(6)
Thirteenth: This should be the accepted answer, even if it is only valid as of Windows 10. It's definitely the way to go for anyone looking for a solution now.
Gadgeteer: The part about a “call in a separate thread” is wrong. Even doing it in a separate application would not make it “non-destructive”. (Try: (A) open two applications (they may be different!) with the same keyboard layout chosen. (B) Type a prefix key (= dead key) in one. (D) Switch to the other application, and type a follow-up key; you would get the result of combining the dead key + key.)
Gadgeteer: To add insult to injury, you can even add: (C) Switch to a different keyboard layout (in the first application), and type something. (This may include dead keys too!) This would not change the result in the second application.
Gadgeteer: The part about the advantages of ToUnicode over ToAscii also seems wrong. AFAIK, the only difference is the limitation to the ASCII range (and a possible CP conversion; I do not remember…)
Gadgeteer: See my discussion of non-destructive ToUnicode calls; a very short summary is at the end of this section. It was possible even before the bit 0x04 was introduced, with minor limitations/complications. (Compare with the 4th note in details of input-by-number.)
Gadgeteer: I added an answer (with essentially the comment above as content).
3

Quite an old thread. Unfortunately, it didn't contain the answer I was looking for, and none of the answers seemed to work properly. I finally solved the problem by checking the MSB of the MapVirtualKey return value before calling ToUnicode / ToAscii. It seems to work like a charm:

if(!(MapVirtualKey(kbHookData->vkCode, MAPVK_VK_TO_CHAR)>>(sizeof(UINT)*8-1) & 1)) {
    ToAscii((UINT)kbHookData->vkCode, kbHookData->scanCode,
        keyboard_state, &wCharacter, 0);
}

Quoting MSDN on the return value of MapVirtualKey, if MAPVK_VK_TO_CHAR is used:

[...] Dead keys (diacritics) are indicated by setting the top bit of the return value. [...]
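
To make the bit test easier to read, here is a small portable sketch of the same check. The helper names and the DEAD_KEY_BIT macro are invented for this example. It assumes the value passed in is the 32-bit UINT returned by MapVirtualKey with MAPVK_VK_TO_CHAR, and it tests the same top bit as the shift expression above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Top bit of the 32-bit MapVirtualKey(..., MAPVK_VK_TO_CHAR) result.
 * The macro name is ours. */
#define DEAD_KEY_BIT 0x80000000u

/* True when the mapping marks the key as a dead key (diacritic). */
static bool is_dead_key_mapping(uint32_t mapped) {
    return (mapped & DEAD_KEY_BIT) != 0u;
}

/* Character carried by a dead-key mapping, with the marker bit removed.
 * Only meaningful for values that really are dead-key mappings. */
static uint32_t dead_key_char(uint32_t mapped) {
    assert(is_dead_key_mapping(mapped));
    return mapped & ~DEAD_KEY_BIT;
}
```

In the hook, the translation call would then be skipped whenever is_dead_key_mapping() returns true for the current virtual-key code.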

Lapidify answered 19/6, 2016 at 17:19 Comment(5)
Fetlock: Could you please elaborate? I don't get your code: what happens if you don't enter the if()? Nothing?
Lapidify: If it's not entering the if, the key currently pressed is a dead key. Depending on what you would like to achieve, you can react to it, or simply ignore it if you only want to listen to non-dead key presses. Hope this helps.
Fetlock: Thank you, I got it.
Floorer: Does not work for dead keys that use the "ALT GR" key, for example (right ALT, as it produces a "CTRL + ALT" key message). Example on the French keyboard layout, trying to add this accent over a letter: ` = ì
Gadgeteer: This is wrong on at least two counts. First, MapVirtualKey inspects the binding of the physical key only on the “unshifted”-“un-AltGrayed” level. Second, even if “the current level” is “unshifted”-“un-AltGrayed”, the fact that by default this key is a dead key tells us nothing about how this key behaves now. If it is pressed after another dead key, it may be “again a dead key”, or it may terminate a “key chord”.
2
  1. Stop using ToAscii() and use ToUnicode().
  2. Remember that ToUnicode() may return nothing for dead keys; this is why they are called dead keys.
  3. Any key will have a scan code or a virtual-key code, but not necessarily a character.

You shouldn't conflate keys with characters: assuming that every key/button has a text representation (Unicode) is wrong.

So:

  • for text input, use the characters reported by Windows
  • for checking whether a key is pressed (e.g. in games), use scan codes or virtual keys (virtual keys are probably better)
  • for keyboard shortcuts, use virtual-key codes.
Adjure answered 28/12, 2009 at 16:51 Comment(3)
Lichter: I tried using ToUnicode(), but the same thing happens... the "dead-key" is destroyed, and so a double-accent is shown when I press the accent key, and I can't make any accented letters. Uninstalling the hook immediately fixes the issue. Not calling ToAscii() or ToUnicode() immediately fixes the issue as well.
Adjure: Try making a copy of the keyboard state structure in order to prevent Windows from altering the original one.
Gadgeteer: This is almost completely unrelated to the question (except ToAscii being last-millennium) and wrong on many counts. For example, ToUnicode would return the information about the dead key (although making it non-destructive is tricky; see my answer). Second, there is no way to copy/affect the state of the keyboard layout except through ToUnicode(Ex)/ToAscii.
2

I encountered this issue while creating a key logger in C#, and none of the above answers worked for me.

After searching through many blogs, I came across this keyboard listener, which handles dead keys perfectly.

Pinter answered 27/10, 2019 at 8:44 Comment(0)
2

Here is the full code, which covers dead keys and ALT + NUMPAD input; it is basically a full implementation of TextField input handling:

    [DllImport("user32.dll")]
    public static extern int ToUnicode(uint virtualKeyCode, uint scanCode, byte[] keyboardState, [Out, MarshalAs(UnmanagedType.LPWStr, SizeConst = 64)] StringBuilder receivingBuffer, int bufferSize, uint flags);

    private StringBuilder _pressCharBuffer = new StringBuilder(256);
    private byte[] _pressCharKeyboardState = new byte[256];

    public bool PreFilterMessage(ref Message m)
    {
        var handled = false;

        if (m.Msg == 0x0100 || m.Msg == 0x0102) // WM_KEYDOWN or WM_CHAR
        {

            bool isShiftPressed = (ModifierKeys & Keys.Shift) != 0;
            bool isControlPressed = (ModifierKeys & Keys.Control) != 0;
            bool isAltPressed = (ModifierKeys & Keys.Alt) != 0;
            bool isAltGrPressed = (ModifierKeys & Keys.RMenu) != 0;

            for (int i = 0; i < 256; i++)
                _pressCharKeyboardState[i] = 0;

            if (isShiftPressed)
                _pressCharKeyboardState[(int)Keys.ShiftKey] = 0xff;

            if (isAltGrPressed)
            {
                _pressCharKeyboardState[(int)Keys.ControlKey] = 0xff;
                _pressCharKeyboardState[(int)Keys.Menu] = 0xff;
            }

            if (Control.IsKeyLocked(Keys.CapsLock))
                _pressCharKeyboardState[(int)Keys.CapsLock] = 0xff;

            Char chr = (Char)0;

            int ret = ToUnicode((uint)m.WParam.ToInt32(), 0, _pressCharKeyboardState, _pressCharBuffer, 256, 0);

            if (ret == 0)
                chr = Char.ConvertFromUtf32(m.WParam.ToInt32())[0];
            if (ret == -1)
                ToUnicode((uint)m.WParam.ToInt32(), 0, _pressCharKeyboardState, _pressCharBuffer, 256, 0);
            else if (_pressCharBuffer.Length > 0)
                chr = _pressCharBuffer[0];

            if (m.Msg == 0x0102 && Char.IsWhiteSpace(chr)) // WM_CHAR
                chr = (Char)0;


            if (ret >= 0 && chr > 0)
            {

            // DO YOUR STUFF using either "chr" as a special key (UP, DOWN, etc.)
            // or _pressCharBuffer.ToString() (it can contain more than one character if a dead key was pressed before),
            // and don't forget to set "handled" to true so nobody else can use the message afterwards

            }
        }

        return handled;
    }
Khotan answered 21/2, 2020 at 12:30 Comment(1)
Gadgeteer: This is wrong on many counts. (I do not know what 0x100 and 0x102 mean, so cannot comment in detail.) For example, for doubling deadkeys, see my comment to Blue eyes’s answer. For example, this code indiscriminately mixes VK_-codes and characters. For example, ToUnicode needs to process KeyUp events for Alt-NUMPAD input. For example, the keyboard layout may distinguish handed-variants of Ctrl/Alt. Etc… (I added a correct answer.)
1

Call the ToAscii function twice to process dead keys correctly, as in:

int ta = ToAscii((UINT)kbHookData->vkCode, kbHookData->scanCode,
                 keyboard_state, &wCharacter, 0);
ta = ToAscii((UINT)kbHookData->vkCode, kbHookData->scanCode,
             keyboard_state, &wCharacter, 0);
if (ta == -1)
 ...
Downes answered 2/9, 2010 at 8:19 Comment(2)
Cf: This just 'kills' the dead key for me. E.g. 'ë' becomes 'e', but pressing a dead key twice still gives e.g. ¨¨
Gadgeteer: Essentially, what you are saying is that a triple press of a dead key gives the same result as a simple press, provided one discards the string resulting from the second press. For many keys on many keyboard layouts this may work. However, this would not work for important dead keys on many layouts. (For example, think of how a Compose key may operate. On my layouts, double-Compose enters “another flavor of composing”.) I wrote an answer with the correct approaches to this important question.
1

Calling ToAscii or ToUnicode twice is the answer. I found this and converted it to Delphi, and it works!

cnt:=ToUnicode(VirtualKey, KeyStroke, KeyState, chars, 2, 0);
cnt:=ToUnicode(VirtualKey, KeyStroke, KeyState, chars, 2, 0); //yes call it twice
Yet answered 4/7, 2012 at 7:37 Comment(3)
Floorer: It's a little better, as my '^' char is displayed only after the second key press, as expected, but it doesn't work because the accent doesn't go over the letter: it should produce 'ô', but only produces 'o'.
Floorer: Actually, your answer is the same as @Blue eyes' below: https://mcmap.net/q/607620/-toascii-tounicode-in-a-keyboard-hook-destroys-dead-keys
Gadgeteer: I addressed the problems with this approach in a comment to the Blue eyes answer.
0

Before the introduction of bit 0x04 for the wFlags argument, making non-destructive ToUnicode calls was a tricky topic. The referenced text discusses it (and the current state) in great detail. (Sometimes I suspect that it may have been this discussion that led to the implementation of 0x04… But it had also been suggested earlier.)

A very short summary of “the old way” is at the end of the referenced section: use wFlags=0x01|0x02 and detect/implement input-by-number yourself. (Compare with the 4th note in the details of input-by-number.)

Gadgeteer answered 16/3 at 22:0 Comment(0)
-1

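I copy the vkCode into a queue and do the conversion on another thread:

@HOOKPROC
def keyHookKFunc(code, wParam, lParam):
    global gkeyQueue
    # lParam points at a KBDLLHOOKSTRUCT (assumed to be declared elsewhere as a ctypes.Structure)
    kbd = cast(lParam, POINTER(KBDLLHOOKSTRUCT)).contents
    gkeyQueue.append((code, wParam, kbd.vkCode))
    return windll.user32.CallNextHookEx(0, code, wParam, lParam)

This has the advantage of not delaying key processing by the OS.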
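
For comparison, the hand-off itself could look roughly like this in C: a fixed-size FIFO that the hook pushes into and a worker thread pops from, so events are translated in the order they arrived. All names here are hypothetical, the queue assumes exactly one producer and one consumer, and it relies on C11 <stdatomic.h>, which older MSVC versions do not provide. It only shows the queueing; whether translating on another thread actually avoids the dead-key problem is disputed in the comments on this page.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define KEYQ_CAP 256u  /* must be a power of two */

typedef struct {
    uint32_t vk;    /* virtual-key code */
    uint32_t scan;  /* hardware scan code */
    uint32_t flags; /* e.g. key-up / extended bits from the hook */
} key_event;

typedef struct {
    key_event slots[KEYQ_CAP];
    atomic_uint head; /* next slot to write; advanced by the hook only */
    atomic_uint tail; /* next slot to read; advanced by the worker only */
} key_queue;

static void keyq_init(key_queue *q) {
    assert(q != NULL);
    atomic_init(&q->head, 0u);
    atomic_init(&q->tail, 0u);
}

/* Called from the hook: never blocks; returns false if the queue is full. */
static bool keyq_push(key_queue *q, key_event ev) {
    unsigned head = atomic_load_explicit(&q->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&q->tail, memory_order_acquire);
    if (head - tail == KEYQ_CAP) return false;
    q->slots[head % KEYQ_CAP] = ev;
    atomic_store_explicit(&q->head, head + 1u, memory_order_release);
    return true;
}

/* Called from the worker: returns false if nothing is pending. */
static bool keyq_pop(key_queue *q, key_event *out) {
    unsigned tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&q->head, memory_order_acquire);
    if (head == tail) return false;
    *out = q->slots[tail % KEYQ_CAP];
    atomic_store_explicit(&q->tail, tail + 1u, memory_order_release);
    return true;
}
```

Storing the scan code and flags alongside the vkCode keeps everything a later translation call would need, since the hook's KBDLLHOOKSTRUCT is no longer available once the hook returns.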

Papilloma answered 27/5, 2015 at 15:22 Comment(1)
Gadgeteer: In addition to having nothing to do with the original question, this has the disadvantage of being wrong. (For example, ToUnicode has state, so calls should happen in the correct order.)
-1

This works for me

byte[] keyState = new byte[256];

// Remove this call if you are using it:
// GetKeyboardState(keyState);

// Set only the keys you want
keyState[(int)Keys.ShiftKey] = 0x80; // SHIFT down
keyState[(int)Keys.Menu] = 0x80; // ALT down
keyState[(int)Keys.ControlKey] = 0x80; // CONTROL down
  
//ToAscii should work fine         
if (ToAscii(myKeyboardStruct.VirtualKeyCode, myKeyboardStruct.ScanCode, keyState, inBuffer, myKeyboardStruct.Flags) == 1)
{
    //do something
}
Sewel answered 24/6, 2021 at 13:6 Comment(1)
Gadgeteer: First, this has nothing to do with the original question. Second, this will work in a very limited set of situations only. (See my comment to Stefan Pintilie’s approach, which is much more elaborate and so covers a bit more ground than yours…)

© 2022 - 2024 — McMap. All rights reserved.