...when used in patterns like "\\p{someCharacterClass}".
I've used/seen some:
- Lower
- Upper
- InCombiningDiacriticalMarks
- ASCII
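For context, here is a minimal sketch of how these four names typically show up in code. The class and method names (CharClassDemo, stripDiacritics, and so on) are made up for illustration, and the comments describe default-flag behaviour as I understand it, not an authoritative definition of each class.

```java
import java.text.Normalizer;

class CharClassDemo {
    // Decompose accented letters (NFD) so that each accent becomes a separate
    // combining mark, then delete every character that falls in the
    // Combining Diacritical Marks block.
    static String stripDiacritics(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
    }

    // With default flags these behave as ASCII-only (POSIX-style) classes.
    static boolean isAllLower(String s) { return s.matches("\\p{Lower}+"); }
    static boolean isAllUpper(String s) { return s.matches("\\p{Upper}+"); }
    static boolean isAscii(String s)    { return s.matches("\\p{ASCII}*"); }

    public static void main(String[] args) {
        System.out.println(stripDiacritics("caf\u00e9"));
        System.out.println(isAllLower("abc"));
        System.out.println(isAllUpper("ABC"));
        System.out.println(isAscii("caf\u00e9"));
    }
}
```

Note the "In" prefix on the third name: it selects a Unicode block, whereas Lower, Upper and ASCII are written without a prefix.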
What is the definitive list of all supported built-in character classes? Where is it documented? What are the exact meanings?
Edited...
There seem to be a lot of "RTFM" answers referring to the javadoc for Pattern. That's the first place I looked before asking this question. Just so everyone is clear, the javadoc for Pattern makes no mention of any of the classes listed above.
The "correct" answer will mention "InCombiningDiacriticalMarks" somewhere on the page, and will not be some vague reference to "Unicode Standards".
... Pattern documentation? – Caw
... UnicodeBlock.forName which led to unicode.org, where I found "Where can I find the definitive list of Unicode blocks?" and finally Blocks.txt itself. – Caw
... Tags class match? – Thanet
... Blocks.txt file notes the code point range, so then get the code chart for that range: unicode.org/charts/PDF/UE0000.pdf (I don't know what those "Tags" are used for either.) – Caw
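The trail described in these comments (block name, then UnicodeBlock.forName, then the code point range) can be checked from code. This is only a rough sketch: the class UnicodeBlockDemo and its helper methods are hypothetical names, and it assumes that the "In" prefix in a pattern accepts the same block names as Character.UnicodeBlock.forName.

```java
class UnicodeBlockDemo {
    // Look up a block by name; forName throws IllegalArgumentException
    // if the name is not a known block.
    static Character.UnicodeBlock blockByName(String name) {
        return Character.UnicodeBlock.forName(name);
    }

    // Test a single code point against \p{In<blockName>}.
    // Character.toChars handles supplementary code points such as U+E0001.
    static boolean inBlock(int codePoint, String blockName) {
        String s = new String(Character.toChars(codePoint));
        return s.matches("\\p{In" + blockName + "}");
    }

    public static void main(String[] args) {
        System.out.println(blockByName("CombiningDiacriticalMarks"));
        System.out.println(Character.UnicodeBlock.of(0x0301));
        System.out.println(inBlock(0x0301, "CombiningDiacriticalMarks"));
        System.out.println(inBlock(0xE0001, "Tags"));
    }
}
```

Character.UnicodeBlock.of(codePoint) goes the other way, from a code point to its block, which is handy when checking a range taken from Blocks.txt.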