How do I match French and Russian Cyrillic alphabet characters with a regular expression? I only want to do the alpha characters, no numbers or special characters. Right now I have
[A-Za-z]
It depends on your regex flavor. If it supports Unicode character classes (like .NET, for instance), \p{L}
matches a letter character (in any character set).
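Not every engine has \p{L}. As a rough sketch in Python, whose standard re module does not support \p{...} classes, the built-in str.isalpha() tests the same Unicode letter categories. The helper name is_letters_only is made up for illustration:

```python
def is_letters_only(text: str) -> bool:
    """Return True if text is non-empty and every character is a Unicode letter."""
    # str.isalpha() is False for the empty string and for any string that
    # contains a digit, space, or punctuation character.
    return text.isalpha()


print(is_letters_only("français"))  # French letters, including ç
print(is_letters_only("Привет"))    # Russian Cyrillic letters
print(is_letters_only("abc123"))    # contains digits
```

This accepts letters from every script, so it is only a fit when "any alphabet" is acceptable rather than French and Russian specifically.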
If your regex flavor supports Unicode blocks ([\p{IsCyrillic}]), you can match Cyrillic characters with:
[\p{IsCyrillic}] or [\p{Cyrillic}]
Otherwise, try the code-point range U+0400–U+04FF, which many flavors write as:
[\u0400-\u04FF]
For PHP, use (together with the u modifier):
[\x{0400}-\x{04FF}]
Explanation:
[\p{IsCyrillic}]
Match a character from the Unicode block "Cyrillic" (U+0400–U+04FF) «[\p{IsCyrillic}]»
Note: see the Unicode character list and numeric HTML entities for the range U+0400–U+04FF.
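For flavors without block or script names, the explicit range does the same job. A minimal Python sketch, assuming the goal is to pull Cyrillic words out of mixed text (the function name cyrillic_words is hypothetical):

```python
import re

# U+0400-U+04FF is the Unicode "Cyrillic" block.
CYRILLIC_RUN = re.compile(r"[\u0400-\u04FF]+")


def cyrillic_words(text: str) -> list[str]:
    """Return every maximal run of characters from the Cyrillic block."""
    return CYRILLIC_RUN.findall(text)


print(cyrillic_words("Hello мир, bonjour Україна 42"))
```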
In PHP, try using [\x{0400}-\x{04FF}] instead. regex101.com/r/zcRenT/1 – Frederico
\p{Cyrillic}, you just need to make sure to add a u flag onto the regex – Apply
To match only Russian Cyrillic characters use:
[\u0401\u0451\u0410-\u044f]
which is the equivalent of:
[ЁёА-я]
where А is Cyrillic, not Latin. (Despite looking the same, they have different code points.)
\p{IsCyrillic}, \p{Cyrillic}, and [\u0400-\u04FF], which others suggested, will match all variants of Cyrillic, not only Russian.
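The difference between the two character classes can be checked with a short Python sketch. The names RUSSIAN_ONLY, ANY_CYRILLIC, is_russian and is_cyrillic are illustrative, and the Ukrainian word is just a sample input:

```python
import re

# Russian alphabet only: Ё, ё, and the contiguous range А..я.
RUSSIAN_ONLY = re.compile(r"[\u0401\u0451\u0410-\u044f]+")
# The whole Cyrillic block, which also covers Ukrainian, Serbian, etc.
ANY_CYRILLIC = re.compile(r"[\u0400-\u04FF]+")


def is_russian(word: str) -> bool:
    """True if word consists only of Russian alphabet letters."""
    return RUSSIAN_ONLY.fullmatch(word) is not None


def is_cyrillic(word: str) -> bool:
    """True if word consists only of characters from the Cyrillic block."""
    return ANY_CYRILLIC.fullmatch(word) is not None


for word in ["ёлка", "їжак", "hello"]:
    print(word, is_russian(word), is_cyrillic(word))
```

The Ukrainian letter ї (U+0457) lies inside the block but outside the Russian class, so the two functions disagree on the second word.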
If you use a modern PHP version, just use:
preg_match('/^\p{L}+$/u', $string);
Don't forget the u flag for Unicode support!
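One caveat: a letter class does not cover combining marks. The comment under this answer mentions the name Бори́с; assuming it is typed with a combining acute accent (U+0301) as the stress mark, that character belongs to the mark category (Mn), not to a letter category, which would explain the failed match. A small Python sketch to inspect this (only_letters_and_marks is a hypothetical helper):

```python
import unicodedata

# Assumption: the stress mark is U+0301 COMBINING ACUTE ACCENT after "и".
name = "Бори\u0301с"

for ch in name:
    print(f"U+{ord(ch):04X} {unicodedata.category(ch)} {unicodedata.name(ch)}")


def only_letters_and_marks(text: str) -> bool:
    """True if text is non-empty and every character is a letter (L*) or a mark (M*)."""
    return bool(text) and all(unicodedata.category(ch)[0] in "LM" for ch in text)


print(name.isalpha())                # letters only
print(only_letters_and_marks(name))  # letters plus combining marks
```

In a flavor with Unicode properties, the analogous change would be to allow marks as well, for example [\p{L}\p{M}] instead of \p{L}.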
Бори́с but it does not match, so your regex does not work. – Phenomena
Regex to match Cyrillic letters together with Latin (English) letters:
^[A-Za-z.!@?#"$%&:;() *\+,\/;\-=[\\\]\^_{|}<>\u0400-\u04FF]*$
It matches special characters, Cyrillic letters, and English letters.
Various regex dialects use [:alpha:] for any alphabetic character in the current locale. (You may need to put that in a character class, e.g. [[:alpha:]].)
[[:lower:]] and [[:upper:]] for matching specific case. E.g. replace lower case characters: regexp_replace(firstname, '[[:lower:]]', 'a', 'g'). – Kitchenmaid
This worked for me:
[a-z\u0400-\u04FF]
[\u0400-\u04FF] – Pettifogger
If you use Elixir:
String.match?(string, ~r/^\p{Cyrillic}*$/u)
You need to add the u flag for Unicode support.
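Keep in mind that the * quantifier also accepts the empty string. A minimal Python sketch of the difference, using the explicit range [\u0400-\u04FF] because Python's standard re module has no \p{Cyrillic} (the names STAR, PLUS and matches are illustrative):

```python
import re

STAR = re.compile(r"[\u0400-\u04FF]*")  # zero or more Cyrillic characters
PLUS = re.compile(r"[\u0400-\u04FF]+")  # one or more Cyrillic characters


def matches(pattern: re.Pattern, text: str) -> bool:
    """True if the whole of text matches pattern."""
    return pattern.fullmatch(text) is not None


print(matches(STAR, ""))      # the empty string passes with *
print(matches(PLUS, ""))      # and is rejected with +
print(matches(PLUS, "тест"))  # non-empty Cyrillic input passes either way
```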
true for empty String: String.match?("", ~r/^\p{Cyrillic}*$/u) => true. You should change the * modifier to + to fix that. – Slipshod
You can use the first and the last letter. For example in Bulgarian:
[А-я]+
For modern PHP (source):
$string = 'тест тест Тест Обязателльно Stackoverflow >!<';
var_dump(preg_replace('/[\x{0410}-\x{042F}]+.*[\x{0410}-\x{042F}]+/iu', '', $string));
In Java, to match Cyrillic letters and whitespace, use the following pattern:
^[\p{InCyrillic}\s]+$
[ЁёА-я] (where А is Russian). The Unicode code point for Russian а comes right after Я, so you don't need two ranges. The code points for Ёё are not between А and я, so you need to specify Ёё separately. – Stefaniastefanie
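The code points behind that explanation can be checked directly. A short Python sketch (the function name in_basic_range is made up for illustration):

```python
# Print the code points of the boundary letters and of Ё/ё.
for ch in "ЁАЯаяё":
    print(f"{ch} U+{ord(ch):04X}")


def in_basic_range(ch: str) -> bool:
    """True if ch falls in the single range А..я (U+0410 to U+044F)."""
    return 0x0410 <= ord(ch) <= 0x044F


print(in_basic_range("Я"), in_basic_range("а"))  # both ends of the case boundary
print(in_basic_range("Ё"), in_basic_range("ё"))  # the two letters outside the range
```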