Is there any way to detect Chinese characters using Perl? And is there a way to split Chinese characters correctly using the dot symbol '.'?
Detect Chinese characters using Perl?
It depends on what you count as a Chinese character. Perhaps you're looking for /\p{Script=Hani}/, but if we want to cast our net wide, the following regex pattern will match stuff that occurs in Chinese writing. Restrict it if necessary.
use 5.014;
/
(?: \p{Block=CJK_Compatibility}
| \p{Block=CJK_Compatibility_Forms}
| \p{Block=CJK_Compatibility_Ideographs}
| \p{Block=CJK_Compatibility_Ideographs_Supplement}
| \p{Block=CJK_Radicals_Supplement}
| \p{Block=CJK_Strokes}
| \p{Block=CJK_Symbols_And_Punctuation}
| \p{Block=CJK_Unified_Ideographs}
| \p{Block=CJK_Unified_Ideographs_Extension_A}
| \p{Block=CJK_Unified_Ideographs_Extension_B}
| \p{Block=CJK_Unified_Ideographs_Extension_C}
)
/x;
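As a minimal sketch of the narrower /\p{Script=Hani}/ option, the following shows one way to test a string for Han ideographs and to pull them out. The helper names has_han and han_chars are made up for illustration, and the example assumes the strings are already decoded character strings (here via use utf8 for literals).

```perl
use 5.014;
use utf8;                              # literals in this file are UTF-8 source text
binmode STDOUT, ':encoding(UTF-8)';    # print characters as UTF-8

# Hypothetical helper: true if the string contains at least one Han ideograph.
sub has_han {
    my ($text) = @_;
    return $text =~ /\p{Script=Hani}/ ? 1 : 0;
}

# Hypothetical helper: return the Han ideographs found in the string, in order.
sub han_chars {
    my ($text) = @_;
    my @found = $text =~ /(\p{Script=Hani})/g;
    return @found;
}

say has_han('ice cream');              # prints 0
say has_han('冰淇淋 is ice cream');    # prints 1
say join ' ', han_chars('冰.淇.淋');   # prints 冰 淇 淋
```

The Script property only covers ideographs, so punctuation such as the dot is skipped. The block-based pattern above is the wider net.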
Yes, . matches one character. The empty pattern for split does what you mean (DWYM):
use utf8;
split //, '冰淇淋'
# returns ('冰', '淇', '淋')
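Splitting per character like this only works on decoded character strings. use utf8 takes care of literals in the source file, but data read from elsewhere arrives as bytes. The sketch below assumes UTF-8 input and uses the core Encode module. chars_of and split_on_dot are hypothetical helper names. Because an unescaped . in a pattern matches any character, splitting on a literal dot needs \. instead.

```perl
use 5.014;
use utf8;
use Encode qw(decode);
binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical helper: split a string into single characters.
sub chars_of {
    my ($text) = @_;
    my @chars = split //, $text;
    return @chars;
}

# Hypothetical helper: split on a literal dot (escaped, since a bare . matches anything).
sub split_on_dot {
    my ($text) = @_;
    my @parts = split /\./, $text;
    return @parts;
}

# Raw UTF-8 bytes for the three characters, as they might arrive from a file.
my $bytes = "\xE5\x86\xB0\xE6\xB7\x87\xE6\xB7\x8B";
my $text  = decode('UTF-8', $bytes);

my @raw = chars_of($bytes);
my @dec = chars_of($text);
say scalar @raw;                          # prints 9 (bytes, not characters)
say scalar @dec;                          # prints 3
say join '|', @dec;                       # prints 冰|淇|淋
say join '|', split_on_dot('冰.淇.淋');   # prints 冰|淇|淋
```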
What if the string is 冰.淇. and I only want to split off the last dot, not every dot in the string? –
Indevout
PerlDoc page on this technique: perldoc.perl.org/… –
Isoline