Remove Russian letters from a string in PHP
How can I remove all Russian letters from a string in PHP?
Or, the opposite: I would like to keep only
English letters, whitespace, numbers and all the signs like !@#$%^&*(){}":?><>~'"

How can I accomplish that? Thank you.

I figured it out: I replace every run of Russian characters with ### and then cut out everything from the first ### through the last ###.

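// mark every run of Russian letters with ###
$desc = preg_replace('/[а-я]+/iu', '###', $desc);

$start = strpos($desc, '###');  // first marker
$end = strrpos($desc, '###');   // last marker

if ($start !== false) {
    // drop everything from the first marker through the last one
    $descStart = substr($desc, 0, $start);
    $descEnd = substr($desc, $end + 3);
    $desc = $descStart . $descEnd;
}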
Twinned answered 19/8, 2012 at 12:7 Comment(1)
In Soviet Russia, the letters remove you! - Peroxy
$string = 'тест тест Тест Обязателльно Stackoverflow >!<';
var_dump(preg_replace('/[\x{0410}-\x{042F}]+.*[\x{0410}-\x{042F}]+/iu', '', $string));

The input string must be UTF-8 encoded (the u modifier requires it), and the output will be UTF-8 too.
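For the "position of the first and last Russian letter" variant, here is a minimal sketch, not a definitive implementation. It assumes UTF-8 input and uses the \p{Cyrillic} script property instead of the explicit range; the function name cutCyrillicSpan is hypothetical. Unlike the two-sided pattern above, it also handles a string that contains only a single Russian letter.

```php
<?php
// Hypothetical helper: remove everything from the first Cyrillic letter
// through the last one, including digits and punctuation in between.
function cutCyrillicSpan(string $text): string
{
    // PREG_OFFSET_CAPTURE yields byte offsets, which is what substr() expects.
    if (!preg_match_all('/\p{Cyrillic}/u', $text, $m, PREG_OFFSET_CAPTURE)) {
        return $text; // no Cyrillic letters (or invalid UTF-8): leave untouched
    }
    $first = $m[0][0];
    $last  = $m[0][count($m[0]) - 1];
    $start = $first[1];                   // byte offset of the first letter
    $end   = $last[1] + strlen($last[0]); // byte offset just past the last letter
    return substr($text, 0, $start) . substr($text, $end);
}

var_dump(cutCyrillicSpan('тест тест Тест Обязателльно Stackoverflow >!<'));
```

Trim the result afterwards if the leftover leading space is unwanted.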

Felicity answered 19/8, 2012 at 12:30 Comment(2)
Hi, I tried it and it works, but it keeps all the numbers and ",." inside the Russian text that was removed. I want to cut out the Russian part by finding the position of the first Russian letter and the position of the last Russian letter. How can I do this? Thank you. - Twinned
Try something like this: /[\x{0410}-\x{042F}]+.*[\x{0410}-\x{042F}]+/iu. After preg_replace you can trim spaces or anything else at the beginning of the string (I replaced the Russian а and я with Unicode escapes). - Felicity

The following regular expression will match letters in the Cyrillic script: http://regex101.com/r/sO0uB7 (example based on Andrey Vorobyev's text)

I think this is what you're after.

I am unsure if the i modifier is necessary.
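In PHP this could look roughly like the sketch below. It assumes the linked pattern is the \p{Cyrillic} script property and that the input is UTF-8; the function name removeCyrillic is hypothetical. The u modifier is needed so that PCRE reads the string as UTF-8, and since a script property covers both upper and lower case, the i modifier should not be required here.

```php
<?php
// Hypothetical helper: delete every Cyrillic letter, keep everything else.
function removeCyrillic(string $text): string
{
    // \p{Cyrillic} matches any character whose Unicode script is Cyrillic,
    // including ё/Ё, which fall outside the contiguous А-я code point range.
    return preg_replace('/\p{Cyrillic}+/u', '', $text);
}

var_dump(removeCyrillic('абв abc 123'));
```

Note that this only deletes the letters themselves; spaces, digits and punctuation between Russian words stay in place.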

Farce answered 19/8, 2012 at 12:43 Comment(2)
I think @AndreyVorobyev needs a solution in PHP. - Lombardo
Was not aware he had to be spoon-fed: preg_replace('/\p{Cyrillic}/u', '', $str); - Farce

My approach would first transliterate the string into ASCII (to keep as much information as possible) and then remove disallowed characters:

$url = iconv("utf-8", "us-ascii//TRANSLIT", $url);
$url = strtolower($url);
$url = preg_replace('~[^-a-z0-9_]+~', '', $url);

You will have to extend the regular expression at the end to match what you need.
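As one possible extension, the final pattern can be widened to the character set from the question (English letters, digits, whitespace and ASCII punctuation). This is only a sketch: it skips the transliteration and lowercasing steps, and the function name keepAscii is hypothetical.

```php
<?php
// Hypothetical helper: keep printable ASCII (0x20-0x7E) plus tab, CR and LF.
function keepAscii(string $text): string
{
    // No u modifier on purpose: the pattern works on bytes, and every byte
    // of a multi-byte UTF-8 character is >= 0x80, so it gets stripped.
    return preg_replace('/[^\x20-\x7E\t\r\n]+/', '', $text);
}

var_dump(keepAscii('тест Test 123 !@#$%^&*(){}":?><>~\''));
```

Because this is a whitelist, it removes any non-ASCII character, not only Russian letters.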

Messinger answered 19/8, 2012 at 12:13 Comment(0)