I want to disallow certain UTF-8 input (server-side), e.g. East Asian scripts, where example input might be "伊".
However, I do want to keep supporting other Latin or "Latin-like" characters, such as the Welsh ŵ and ŷ, so checking against Latin-1 is not an option.
What are my options? (If language-specific, PHP preferred.)
Thanks very much.
Reasoning: browser support for many non-Western characters is often missing (e.g. in a different browser I just see a box in the question above), so for things like display names it is sometimes appropriate to restrict input, even if that would not be appropriate for message bodies.
Use iconv to convert the string to the target encoding, discarding all invalid characters. – Deodar
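For comparison, here is a different technique from the iconv suggestion: whitelisting by Unicode script with a regular expression. This is only a minimal sketch, assuming PHP's PCRE was built with Unicode property support (needed for the `u` modifier and `\p{Latin}`). The helper name `isLatinLike` and the set of extra allowed characters (ASCII digits, space, apostrophe, full stop, hyphen) are my own assumptions for a display-name field, not anything from the question.

```php
<?php
// Hypothetical helper: true when $name consists only of Latin-script letters
// plus a small, assumed set of extra characters suitable for display names.
function isLatinLike(string $name): bool
{
    // \p{Latin} matches letters of the Latin script, including the Welsh
    // characters ŵ and ŷ. The u modifier makes PCRE treat the subject as UTF-8;
    // preg_match() returns false on malformed UTF-8, which is rejected here too.
    // \A and \z anchor the whole string, so a trailing newline is not accepted.
    return preg_match('/\A[\p{Latin}0-9 \'.\-]+\z/u', $name) === 1;
}

var_dump(isLatinLike('Gŵyn ap Nudd')); // bool(true)
var_dump(isLatinLike('伊'));           // bool(false)
```

This validates rather than strips, so the user can be asked to pick another name instead of silently losing characters. Accents typed as separate combining marks are not covered by `\p{Latin}`, so input in decomposed form would probably need normalising first.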