Quoting from the Javadoc of java.util.regex.Pattern:
Unicode support
This class is in conformance with
Level 1 of Unicode Technical Standard #18: Unicode Regular Expression Guidelines, plus RL2.1 Canonical Equivalents.
Unicode escape sequences such as
\u2014 in Java source code are
processed as described in §3.3 of the
Java Language Specification. Such
escape sequences are also implemented
directly by the regular-expression
parser so that Unicode escapes can be
used in expressions that are read from
files or from the keyboard. Thus the
strings "\u2014" and "\\u2014", while
not equal, compile into the same
pattern, which matches the character
with hexadecimal value 0x2014.
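
Outside the quoted Javadoc, here is a minimal sketch of the point about the two strings. The class name UnicodeEscapeDemo and the helper matchesAll are made up for illustration; only Pattern.compile, Matcher.matches and the String methods are real API.

```java
import java.util.regex.Pattern;

class UnicodeEscapeDemo {
    // Hypothetical helper: true if the regex matches the entire input.
    static boolean matchesAll(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        // The compiler resolves this escape, so the string holds one character, U+2014.
        String viaCompiler = "\u2014";
        // Here the backslash is escaped, so the string holds six characters
        // and the regex parser resolves the escape itself.
        String viaRegex = "\\u2014";

        System.out.println(viaCompiler.equals(viaRegex));       // false
        System.out.println(viaCompiler.length());               // 1
        System.out.println(viaRegex.length());                  // 6
        System.out.println(matchesAll(viaCompiler, "\u2014"));  // true
        System.out.println(matchesAll(viaRegex, "\u2014"));     // true
    }
}
```

The second form is the one you end up with when the expression is read from a file or typed by a user, because no compiler ever sees it.
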
Unicode blocks and categories are
written with the \p and \P constructs
as in Perl. \p{prop} matches if the
input has the property prop, while
\P{prop} does not match if the input
has that property. Blocks are
specified with the prefix In, as in
InMongolian. Categories may be
specified with the optional prefix Is:
Both \p{L} and \p{IsL} denote the
category of Unicode letters. Blocks
and categories can be used both inside
and outside of a character class.
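
Again outside the quote, a small sketch of the \p and \P constructs described above. The class name UnicodePropertyDemo and the helper matchesAll are illustrative; the sample characters (U+00E9, a lowercase letter, and U+1820, MONGOLIAN LETTER A) were picked only as convenient test inputs.

```java
import java.util.regex.Pattern;

class UnicodePropertyDemo {
    // Hypothetical helper: true if the regex matches the entire input.
    static boolean matchesAll(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        String eAcute = "\u00e9";     // a letter outside ASCII
        String mongolianA = "\u1820"; // a character in the Mongolian block

        // Category L (letters), with and without the optional Is prefix.
        System.out.println(matchesAll("\\p{L}", eAcute));     // true
        System.out.println(matchesAll("\\p{IsL}", eAcute));   // true

        // Uppercase P negates the property.
        System.out.println(matchesAll("\\P{L}", "7"));        // true
        System.out.println(matchesAll("\\P{L}", eAcute));     // false

        // Blocks use the In prefix.
        System.out.println(matchesAll("\\p{InMongolian}", mongolianA)); // true
        System.out.println(matchesAll("\\p{InMongolian}", "a"));        // false

        // The same constructs also work inside a character class.
        System.out.println(matchesAll("[\\p{L}\\d]+", "abc123"));       // true
    }
}
```
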
The supported categories are those of
The Unicode Standard in the version
specified by the Character class. The
category names are those defined in
the Standard, both normative and
informative. The block names supported
by Pattern are the valid block names
accepted and defined by
UnicodeBlock.forName.
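
To check whether a block name will be accepted before putting it into a pattern, you can ask Character.UnicodeBlock.forName directly. The following is only a sketch: the class name BlockNameDemo and the helper isValidBlockName are invented here, and "NoSuchBlock" is a deliberately bogus name.

```java
import java.lang.Character.UnicodeBlock;

class BlockNameDemo {
    // Hypothetical helper: forName throws IllegalArgumentException for unknown names.
    static boolean isValidBlockName(String name) {
        try {
            UnicodeBlock.forName(name);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidBlockName("Mongolian"));   // true
        System.out.println(isValidBlockName("BasicLatin"));  // true
        System.out.println(isValidBlockName("NoSuchBlock")); // false

        // The block and general category of a code point, as the Character class sees them.
        System.out.println(UnicodeBlock.of(0x1820) == UnicodeBlock.MONGOLIAN);  // true
        System.out.println(Character.getType(0x1820) == Character.OTHER_LETTER); // true
    }
}
```

A name that passes this check can be used in a pattern as \p{In...}, for example \p{InMongolian}.
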