How Can I Make oninput() Trigger Only When Chinese Characters, Not IME Keystrokes, Are Entered?

Asked 22/12, 2018 at 6:18 Answered 7/8, 2023 at 10:6

I have a basic <input type="text" oninput="funct()">.

However, when I type in Chinese, oninput is also triggered by the intermediate IME keystrokes, not just the resulting characters. For example, when I type "我們" with my Pinyin IME, and my function funct() calls console.log(WHAT_I_TYPED), the console reads:

w
wo
wom
wome
women
我們

I want it to read only "我們". However, I don't want to modify the text in the function, since there are too many Chinese IMEs for that to be feasible.

Selfabsorbed answered 22/12, 2018 at 6:18 Comment(2)

Not sure if it will help, but maybe your function could discard any code points not inside the Chinese character ranges: #1366568 – Heelpiece 22/12, 2018 at 6:27

That would help for Chinese, though I was kind of hoping for a more universal solution, since I also have Japanese and other languages to consider. – Selfabsorbed 22/12, 2018 at 9:42

Let's say your oninput function looks like this (a very simple example):

function onInputHandler(value) {
    console.log(value);
}

Now, I found this answer, and I modified the function slightly to deal with sentences/multiple letters:

// Returns true if any character in phrase has distinct upper- and lowercase forms.
function letter(phrase) {
    var output = false;
    var chars = phrase.split("");
    chars.forEach(c => {if (c.toUpperCase() != c.toLowerCase()) output = true});
    return output;
}

And we can implement that into our oninput handler like so:

function onInputHandler(value) {
    if (!letter(value)) {
        console.log(value);
    }
}

So it will print out the value as long as it contains no letters. Here's a demonstration:

function letter(phrase) {
    var output = false;
    var chars = phrase.split("");
    chars.forEach(c => {if (c.toUpperCase() != c.toLowerCase()) output = true});
    return output;
}

function onInputHandler(value) {
    if (!letter(value)) {
        console.log(value);
    }
}

<input type="text" oninput="onInputHandler(this.value)">

But you'll notice in the above snippet, it also prints out whitespace - try it! I added another condition to the if statement in onInputHandler:

if (!letter(value) && value.trim() != "") {
    console.log(value);
}

Now this will not print any input containing characters that have differing uppercase and lowercase forms (most alphabets, including Latin (English), Greek, and Cyrillic (Russian)).
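To see how this check behaves on the keystroke sequence from the question, here is a small sketch that runs outside the browser. It assumes the comparison in letter is written without a leading negation, and shouldLog is a hypothetical name for the combined condition.

```javascript
// Returns true if any character has distinct upper- and lowercase forms.
function letter(phrase) {
    return phrase.split("").some(c => c.toUpperCase() != c.toLowerCase());
}

// Hypothetical helper: the condition used inside onInputHandler.
function shouldLog(value) {
    return !letter(value) && value.trim() != "";
}

// Intermediate Pinyin keystrokes followed by the committed text.
const states = ["w", "wo", "wom", "wome", "women", "我們"];
const logged = states.filter(shouldLog);
console.log(logged); // only the committed text passes the check
```

Keep in mind that this relies on the intermediate keystrokes being cased letters. Symbols without a case distinction, such as Zhuyin (ㄨㄛ), would pass the check as well.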

However, if you want to do it another way, and check if each character in the string is a Chinese character, you could look at this question and its answers. Here's an example of a regular expression you could construct for Chinese characters (not perfect, but it covers most of the characters):

const chineseCharacterRegex = /[\u4e00-\u9fff]|[\u3400-\u4dbf]|[\u{20000}-\u{2a6df}]|[\u{2a700}-\u{2b73f}]|[\u{2b740}-\u{2b81f}]|[\u{2b820}-\u{2ceaf}]|[\uf900-\ufaff]|[\u3300-\u33ff]|[\ufe30-\ufe4f]|[\u{2f800}-\u{2fa1f}]/u;

And here's a function that checks if all characters in a string match that regex:

function chineseChar(phrase) {
    var output = true;
    // Array.from splits by code point, so characters outside the BMP stay intact.
    var chars = Array.from(phrase);
    chars.forEach(c => {if (!c.match(chineseCharacterRegex)) output = false});
    return output;
}

Now we can implement this in our function like so:

const chineseCharacterRegex = /[\u4e00-\u9fff]|[\u3400-\u4dbf]|[\u{20000}-\u{2a6df}]|[\u{2a700}-\u{2b73f}]|[\u{2b740}-\u{2b81f}]|[\u{2b820}-\u{2ceaf}]|[\uf900-\ufaff]|[\u3300-\u33ff]|[\ufe30-\ufe4f]|[\u{2f800}-\u{2fa1f}]/u;
function onInputHandler(value) {
    if (chineseChar(value)) {
        console.log(value);
    }
}

Note: I do know the regex doesn't match all Chinese characters. This answer explains why, and you can find most of the remaining code point ranges by reading the answers and following the links in this thread.
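As an alternative to maintaining the ranges by hand, here is a minimal sketch using Unicode property escapes (\p{Script=Han}, available in ES2018 regular expressions with the u flag). The names hanOnly and cjkOnly are hypothetical, and the set of scripts chosen for Japanese and Korean is an assumption. It deliberately rejects punctuation and spaces.

```javascript
// Matches strings made up entirely of Han characters (at least one).
const hanOnly = /^\p{Script=Han}+$/u;

// Assumed extension for Japanese and Korean: kana and Hangul are allowed too.
const cjkOnly = /^[\p{Script=Han}\p{Script=Hiragana}\p{Script=Katakana}\p{Script=Hangul}]+$/u;

function onInputHandler(value) {
    if (hanOnly.test(value)) {
        console.log(value);
    }
}

console.log(hanOnly.test("我們"));       // true
console.log(hanOnly.test("women"));      // false
console.log(cjkOnly.test("こんにちは")); // true
console.log(cjkOnly.test("한국어"));     // true
```

A script check like this has a limit: a Japanese IME shows kana while composing, so the pattern alone cannot tell intermediate kana from committed kana.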

Hopefully this helps!

Edrick answered 22/12, 2018 at 7:0 Comment(1)

This letter example is a good answer, but it won't work with Zhuyin Fuhao, since the uppercase and lowercase inputs are identical there. For the other example, I'd have to make sure that the missing Hanzi are extremely rare (not too hard to check). However, I was hoping to use this for any language that doesn't display characters (Chinese, Japanese, and Korean) until certain inputs are given. – Selfabsorbed 22/12, 2018 at 9:44

Instead of trying to trigger input only for non-IME characters, I'd suggest using the event.isComposing attribute to identify whether the input data is being inserted during IME composition.

inputElement.addEventListener("input", (event) => {
    if (!event.isComposing) {
        // Do what you need to do with non-IME input
    }
});

For more details about what the lifecycle looks like (and when the isComposing attribute is true or false), take a look at the Key Events During Composition specification.
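A minimal sketch of one way to wire this up, under the assumption that browsers may differ in whether an input event with isComposing set to false follows the end of a composition. To cover both cases it also listens for compositionend and skips a duplicate report of the same value. The name onCommittedInput is hypothetical, and target can be any EventTarget with a value property, such as an input element.

```javascript
// Calls callback(value) for ordinary typing and for committed IME text,
// but not for intermediate composition states.
function onCommittedInput(target, callback) {
    // Value already reported from compositionend, or null if none is pending.
    let reportedAtCompositionEnd = null;

    target.addEventListener("input", (event) => {
        const skip = event.isComposing || target.value === reportedAtCompositionEnd;
        reportedAtCompositionEnd = null;
        if (!skip) {
            callback(target.value);
        }
    });

    target.addEventListener("compositionend", () => {
        reportedAtCompositionEnd = target.value;
        callback(target.value);
    });
}

// In a page (the selector is an assumption):
// onCommittedInput(document.querySelector("input"), (v) => console.log(v));
```

The duplicate check is a simplification: it suppresses only an input event that immediately follows compositionend with an unchanged value.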

Unilocular answered 7/8, 2023 at 10:6 Comment(0)
